The topologies of predictable dynamic networks change continuously in terms of node position, network connectivity, and link metric. However, their dynamics are largely predictable compared with ad-hoc networks. Existing routing protocols designed for static or ad-hoc networks do not exploit this predictability and are therefore inefficient in some cases. We present a topology model based on a Divide-and-Merge methodology that formulates the dynamic topology as a series of static topologies, reflecting the topology dynamics correctly with the fewest static topologies. We then design a dynamic programming algorithm to solve this model and determine the timing of routing updates and the topology to be used. Moreover, in the classic predictable dynamic network, the space Internet, links in some regions have shorter delays, which leads most traffic to converge on these links. Meanwhile, the connectivity and metrics of these links vary continuously, which results in great end-to-end path variation and frequent routing updates. In this paper, we propose a stable routing scheme that adds link lifetime to the routing metric in order to eliminate these dynamics. We then exploit the greedy nature of Dijkstra's algorithm to release some paths from the dynamic links, achieving routing stability. Experimental results show that our method significantly decreases the number of changed paths and affected network nodes, greatly improving network stability. Interestingly, our method also achieves better network performance, including fewer lost packets, smoother variation of end-to-end delay, and higher throughput.
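As a rough illustration of the lifetime-augmented metric idea (not the authors' implementation; the penalty weight `alpha` and the graph encoding below are assumptions for illustration), a shortest-path computation can discount links that are about to expire:

```python
import heapq

def stable_dijkstra(graph, source, alpha=1.0):
    """Dijkstra over a graph whose edges carry (delay, lifetime).

    graph: {node: [(neighbor, delay, lifetime), ...]}
    alpha: hypothetical weight trading delay against link stability;
           short-lived links are penalized so paths tend to avoid them.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, delay, lifetime in graph.get(u, []):
            # Composite metric: nominal delay plus a stability penalty
            # that grows as the link's remaining lifetime shrinks.
            w = delay + alpha / max(lifetime, 1e-9)
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev
```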
We give a complete characterization of the possible response matrices at a fixed frequency of n-terminal electrical networks of inductors, capacitors, resistors and grounds, and of n-terminal discrete linear elastodynamic networks of springs and point masses, both in the three-dimensional case and in the two-dimensional case. Specifically, we construct networks which realize any response matrix compatible with the known symmetry properties and thermodynamic constraints of response matrices. Due to a mathematical equivalence, we also obtain a characterization of the response matrices of discrete acoustic networks.

In this paper we propose the Structured Deep Neural Network (structured DNN) as a structured and deep learning framework. This approach learns to find the best structured object (such as a label sequence) given a structured input (such as a vector sequence) by globally considering the mapping relationships between the structures rather than item by item. When automatic speech recognition (ASR) is viewed as a special case of such a structured learning problem, with the acoustic vector sequence as the input and the phoneme label sequence as the output, it becomes possible to learn comprehensively utterance by utterance as a whole, rather than frame by frame. The Structured Support Vector Machine (structured SVM) was previously proposed to perform ASR with structured learning, but it is limited by the linear nature of the SVM. Here we propose the structured DNN, which uses nonlinear transformations in multiple layers, as a structured and deep learning approach. It was shown to beat the structured SVM in preliminary experiments on TIMIT.

While deep learning models have achieved state-of-the-art accuracies for many prediction tasks, understanding these models remains a challenge. Despite the recent interest in developing visual tools to help users interpret deep learning models, the complexity and wide variety of models deployed in industry, and the large-scale datasets that they use, pose unique design challenges that are inadequately addressed by existing work. Through participatory design sessions with over 15 researchers and engineers at Facebook, we have developed, deployed, and iteratively improved ActiVis, an interactive visualization system for interpreting large-scale deep learning models and results. By tightly integrating multiple coordinated views, such as a computation-graph overview of the model architecture and a neuron-activation view for pattern discovery and comparison, users can explore complex deep neural network models at both the instance and subset level. ActiVis has been deployed on Facebook's machine learning platform. We present case studies with Facebook researchers and engineers, and usage scenarios of how ActiVis may work with different models.

In this paper we describe the design and implementation of the Open Science Data Cloud, or OSDC. The goal of the OSDC is to provide petabyte-scale data cloud infrastructure and related services for scientists working with large quantities of data. Currently, the OSDC consists of more than 2000 cores and 2 PB of storage distributed across four data centers connected by 10G networks. We discuss some of the lessons learned during the past three years of operation and describe the software stacks used in the OSDC. We also describe some of the research projects in biology, the earth sciences, and the social sciences enabled by the OSDC.

A central problem in machine learning involves modeling complex datasets using highly flexible families of probability distributions in which learning, sampling, inference, and evaluation remain analytically or computationally tractable. Here, we develop an approach that simultaneously achieves both flexibility and tractability. The essential idea, inspired by non-equilibrium statistical physics, is to systematically and slowly destroy structure in a data distribution through an iterative forward diffusion process. We then learn a reverse diffusion process that restores structure in the data, yielding a highly flexible and tractable generative model of the data. This approach allows us to rapidly learn, sample from, and evaluate probabilities in deep generative models with thousands of layers or time steps, as well as to compute conditional and posterior probabilities under the learned model. We additionally release an open-source reference implementation of the algorithm.
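A minimal sketch of the forward diffusion step described here (the Gaussian kernel is the standard choice; the linear variance schedule and step count are illustrative assumptions, not details taken from this abstract):

```python
import numpy as np

def forward_diffusion(x0, T=1000, beta_min=1e-4, beta_max=0.02, rng=None):
    """Gradually destroy structure in data x0 by iterated Gaussian kernels.

    Each step applies q(x_t | x_{t-1}) = N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I);
    after many steps the samples approach an isotropic Gaussian.
    """
    rng = rng or np.random.default_rng(0)
    betas = np.linspace(beta_min, beta_max, T)
    x = np.array(x0, dtype=float)
    trajectory = [x.copy()]
    for beta in betas:
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * rng.standard_normal(x.shape)
        trajectory.append(x.copy())
    return trajectory  # x_0 ... x_T; a learned reverse process would invert this chain
```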
This paper reviews the current status and challenges of Neural Network (NN) based machine learning approaches for modern power grid stability control, including their design and implementation methodologies. NNs are widely accepted as Artificial Intelligence (AI) approaches offering an alternative way to control complex and ill-defined problems. In this paper, various applications of NNs to the power system rotor-angle stabilization and control problem are discussed. The main focus of this paper is on the use of Reinforcement Learning (RL) and Supervised Learning (SL) algorithms in power system wide-area control (WAC). Owing to their capability to model nonlinearities and uncertainties, these algorithms are generally used for transient classification, neuro-control, wide-area monitoring and control, renewable energy management and control, and so on. The works of researchers in the field of conventional and renewable energy systems are reported and categorized. The paper concludes by presenting, comparing and evaluating various learning techniques and infrastructure configurations based on efficiency.

We propose a novel semantic segmentation algorithm by learning a deconvolution network. We learn the network on top of the convolutional layers adopted from the VGG 16-layer net. The deconvolution network is composed of deconvolution and unpooling layers, which identify pixel-wise class labels and predict segmentation masks. We apply the trained network to each proposal in an input image and construct the final semantic segmentation map by combining the results from all proposals in a simple manner. The proposed algorithm mitigates the limitations of existing methods based on fully convolutional networks by integrating a deep deconvolution network with proposal-wise prediction; our segmentation method typically identifies detailed structures and handles objects at multiple scales naturally. Our network demonstrates outstanding performance on the PASCAL VOC 2012 dataset, and we achieve the best accuracy (72.5%) among methods trained without external data through an ensemble with the fully convolutional network.
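A toy sketch of the unpooling-plus-deconvolution pairing described above (layer sizes are illustrative; this is not the paper's VGG-based architecture):

```python
import torch
import torch.nn as nn

class DeconvBlock(nn.Module):
    """One unpooling + deconvolution stage of a decoder network.

    MaxUnpool2d reverses max pooling using the stored pooling indices,
    and ConvTranspose2d (deconvolution) densifies the sparse unpooled map.
    """
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)
        self.deconv = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x, pool_indices):
        return self.relu(self.deconv(self.unpool(x, pool_indices)))

# Usage: the encoder must keep the pooling indices for the decoder.
pool = nn.MaxPool2d(2, stride=2, return_indices=True)
feat = torch.randn(1, 64, 32, 32)
pooled, idx = pool(feat)
up = DeconvBlock(64, 32)(pooled, idx)   # back to 32x32 spatial resolution
```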
In this paper, an improved version of the graded-precision localization algorithm GRADELOC, called IGRADELOC, is proposed. The performance of GRADELOC depends on the regions formed by the overlapping radio ranges of the nodes of the underlying sensor network. A different region pattern can significantly alter the nature and precision of localization. In IGRADELOC, two improvements are suggested. First, modifications are proposed in the radio range of the fixed-grid nodes, keeping in mind the actual radio range of commonly available nodes, to allow for routing through them. Routing is not addressed by GRADELOC, but it is of prime importance to the deployment of any ad-hoc network, especially sensor networks. A theoretical model expressing the radio range in terms of the cell dimensions of the grid infrastructure is proposed to help carry out a deployment plan that achieves the desired precision of coarse-grained localization. Second, it is observed that in GRADELOC fine-grained localization does not achieve significant performance benefits over coarse-grained localization. In IGRADELOC, this is addressed by introducing a parameter that can be used to improve and fine-tune the precision of fine-grained localization.

We study transport properties of bulk-disordered quasi-one-dimensional (Q1D) wires, paying particular attention to the role of long-range correlations embedded in the disorder. First, we show that for stratified disorder, for which the disorder is the same for all individual chains forming the Q1D wire, the transport properties can be described analytically provided the disorder is weak. When the disorder in every chain is not the same but has the same binary correlator, no general theory is available. We therefore consider the case when only one channel is open and all others are closed. For this situation we suggest a semi-analytical approach which is quite effective for the description of the total transmission coefficient. Our numerical data confirm the validity of our approach. Such Q1D disordered structures with anomalous transport properties can be the subject of experimental study.

The difference between the surface and deep structures of a spreadsheet is a major cause of difficulty in checking spreadsheets. After a brief survey of current methods of checking (or debugging) spreadsheets, new visual methods of showing the deep structures are presented. Illustrations are given of how these visual methods can be employed in various interactive local and global debugging strategies.

The small-world (SW) construct has undeniably been one of the most popular network descriptors in the neuroscience literature. Two main reasons for its lasting popularity are its apparent ease of computation and the intuitions it is thought to provide on how networked systems operate. Over the last few years, some pitfalls of the SW construct and, more generally, of network summary measures, have been widely acknowledged.

Systems which can spontaneously reveal periodic evolution are dubbed time crystals. This is in analogy with space crystals, which display periodic behavior in configuration space. While space crystals are modelled with the help of spatially periodic potentials, crystalline phenomena in time can be modelled by periodically driven systems. Disorder in the periodic driving can lead to Anderson localization in time: the probability of detecting the system at a fixed point of configuration space becomes exponentially localized around a certain moment in time. We show here that a three-dimensional system exposed to properly disordered pseudo-periodic driving may display a localized-delocalized Anderson transition in the time domain, in strong analogy with the usual three-dimensional Anderson transition in disordered systems. Such a transition could be observed experimentally with ultra-cold atomic gases.

In this paper, we present a comprehensive study of Medium Access Control (MAC) protocols developed for Wireless Body Area Networks (WBANs). In WBANs, small battery-operated on-body or implanted biomedical sensor nodes are used to monitor physiological signs such as temperature, blood pressure, ElectroCardioGram (ECG), ElectroEncephaloGraphy (EEG), etc. We discuss design requirements for WBANs together with the major sources of energy dissipation. We then investigate the existing protocols designed for WBANs, with a focus on their strengths and weaknesses. The paper ends with concluding remarks and open research issues for future work.

We present a self-consistent local approach to self-generated glassiness which is based on the concept of dynamical mean-field theory for many-body systems. Using a replica approach to self-generated glassiness, we map the problem onto an effective local problem which can be solved exactly. Applying the approach to the Brazovskii model, relevant to a large class of systems with frustrated micro-phase separation, we are able to solve the self-consistent local theory without using additional approximations. We demonstrate that a glassy state found earlier in this model is generic and does not arise from the use of perturbative approximations. In addition, we demonstrate that the glassy state depends strongly on the strength of the frustrated phase separation in that model.

Results of experiments with liquid 3He immersed in a new type of aerogel are described. This aerogel consists of Al2O3 strands which are nearly parallel to each other, so we call it a "nematically ordered" aerogel. At all pressures used, a superfluid transition was observed, and a superfluid phase diagram was measured. Possible structures of the observed superfluid phases are discussed.
Applying a new method of rescattering based on the neural network technique, we study the influence of rescattering on the spectra of strange particles produced in heavy-ion reactions. In contrast to former approaches, the rescattering is done explicitly and not in a perturbative fashion. We present a comparison of our calculations for the system Ni(1.93 AGeV)+Ni with recent data of the FOPI collaboration. We find that even for this small system rescattering changes the observables considerably but does not invalidate the role of the kaons as messengers from the high-density zone. We cannot confirm the conjecture that the kaon flow can be of use for the determination of the optical potential of the kaon.

Min-SEIS-Cluster is an optimization problem which aims at minimizing infection spreading in networks. In this problem, nodes can be susceptible to an infection, exposed to an infection, or infectious. One of the main features of this problem is the fact that nodes have different dynamics when interacting with nodes from the same community. Thus, the problem is characterized by distinct probabilities of infecting nodes from the same and from different communities. This paper presents a new genetic algorithm that solves the Min-SEIS-Cluster problem. This genetic algorithm significantly surpassed the current heuristic for this problem, reducing the number of infected nodes during the simulation of the epidemics. The results therefore suggest that our new genetic algorithm is the state-of-the-art heuristic for this problem.

Elements of the pomeron phenomenology within the Regge-pole exchange picture are recalled. This includes a discussion of the high-energy behaviour of total cross-sections, the triple-pomeron limit of diffractive dissociation, and the single-particle distributions in the central region. The BFKL pomeron and QCD expectations for the small-$x$ behaviour of the deep inelastic scattering structure functions are discussed. The dedicated measurements of the hadronic final state in deep inelastic scattering at small $x$ probing the QCD pomeron are described. Deep inelastic diffraction is also discussed.

In this paper we introduce WiNV, a framework for web-based interactive scalable network visualization. WiNV enables a new class of rich and scalable interactive cross-platform capabilities for visualizing large-scale networks natively in a user's browser. Extensive experiments show that our system can visualize networks consisting of tens of thousands of nodes while maintaining fast, high-quality interaction.

The sensor network is a network technique for the implementation of a ubiquitous computing environment. It is a wireless network environment that consists of many lightweight, low-power sensors. Although the sensor network provides various capabilities, it is unable to ensure secure authentication between nodes, which ultimately undermines the reliability of the entire network and causes many security problems. Therefore, an encryption algorithm applicable to sensor networks is required for the implementation of reliable sensor network environments. In this paper, we propose a solution for reliable sensor networks and analyze communication efficiency by measuring the performance of the AES encryption algorithm as a function of plaintext size, as well as the cost of operation per hop according to the network scale.
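A minimal sketch of the kind of measurement described (using the PyCryptodome library; the key size, CTR block mode, trial count, and plaintext sizes are assumptions for illustration, not the paper's setup):

```python
import time
from Crypto.Cipher import AES  # PyCryptodome
from Crypto.Random import get_random_bytes

def time_aes_encrypt(plaintext_size, trials=100):
    """Measure average AES encryption time for a given plaintext size."""
    key = get_random_bytes(16)           # AES-128
    data = get_random_bytes(plaintext_size)
    cipher = AES.new(key, AES.MODE_CTR)  # CTR mode: no padding needed
    start = time.perf_counter()
    for _ in range(trials):
        cipher.encrypt(data)
    return (time.perf_counter() - start) / trials

for size in (64, 256, 1024, 4096):       # plaintext sizes in bytes
    print(size, f"{time_aes_encrypt(size) * 1e6:.1f} us")
```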
We propose an optimization approach to design cost-effective electrical power transmission networks. That is, we aim to select both the network structure and the line conductances (line sizes) so as to optimize the trade-off between network efficiency (low power dissipation within the transmission network) and the cost of building the network. We begin with a convex optimization method based on the paper ``Minimizing Effective Resistance of a Graph'' [Ghosh, Boyd \& Saberi]. We show that this (DC) resistive network method can be adapted to the context of AC power flow. However, this does not address the combinatorial aspect of selecting the network structure. We approach this problem as selecting a subgraph within an over-complete network, posed as minimizing the (convex) network power dissipation plus a non-convex cost on line conductances that encourages sparse networks in which many line conductances are set to zero. We develop a heuristic approach to solve this non-convex optimization problem using: (1) a continuation method to interpolate from the smooth, convex problem to the (non-smooth, non-convex) combinatorial problem, and (2) the majorization-minimization algorithm to perform the necessary intermediate smooth but non-convex optimization steps. Ultimately, this involves solving a sequence of convex optimization problems in which we iteratively reweight a linear cost on line conductances to fit the actual non-convex cost. Several examples are presented which suggest that the overall method is a good heuristic for network design. We also consider how to obtain sparse networks that are still robust against failures of lines and/or generators.
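The iterative-reweighting step described above can be illustrated with a small sketch (a generic reweighted-linear-cost scheme on a DC resistive model using CVXPY; the toy network, dissipation proxy, weight update, and conductance bound are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np
import cvxpy as cp

# Toy 4-node network with an over-complete edge set; node 0 is the ground.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (0, 3)]
n, m = 4, len(edges)
A = np.zeros((n, m))
for k, (i, j) in enumerate(edges):
    A[i, k], A[j, k] = 1.0, -1.0
Ar = A[1:, :]                      # reduced incidence (ground node removed)
b = np.array([1.0, -0.5, -0.5])    # net injections at non-ground nodes

w = np.ones(m)                     # initial linear weights on conductances
for _ in range(10):                # majorization-minimization loop
    g = cp.Variable(m, nonneg=True)
    L = Ar @ cp.diag(g) @ Ar.T     # reduced (grounded) Laplacian, affine in g
    dissipation = cp.matrix_frac(b, L)   # b^T L^{-1} b, convex in (b, L)
    cost = dissipation + cp.sum(cp.multiply(w, g))
    cp.Problem(cp.Minimize(cost), [g <= 10]).solve()
    # Reweighting: large conductances get cheap, small ones get expensive,
    # pushing many g_k toward zero (a sparse network structure).
    w = 1.0 / (g.value + 1e-3)

print(np.round(g.value, 3))        # near-zero entries = lines removed
```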
Genomic alterations lead to cancer complexity and form a major hurdle for a comprehensive understanding of the molecular mechanisms underlying oncogenesis. In this review, we describe recent advances in studying cancer-associated genes from a systems-biology point of view. The integration of known cancer genes into protein and signaling networks reveals the characteristics of cancer genes within networks. This approach shows that cancer genes often function as network hub proteins, which are involved in many cellular processes and form focal nodes in the information exchange between many signaling pathways. Literature mining allows the construction of gene-gene networks, in which new cancer genes can be identified. The gene expression profiles of cancer cells are used for reconstructing gene regulatory networks. By doing so, the genes involved in the regulation of cancer progression can be picked out from these networks, after which their functions can be further confirmed in the laboratory.

The intriguing nature of the classical Homeric narratives has always fascinated occidental culture, contributing to philosophy, history, mythology and, straightforwardly, to literature. But what is so intriguing about Homer's narratives? At first gaze we recognize the literal appeal and aesthetic pleasure present on every page across Homer's chants in the Odyssey and rhapsodies in the Iliad. Second, we may perceive a biased aspect of the stories' contents, varying from real-historical to fictional-mythological. Complementing this view, there are new archeological findings that support the historicity of some events described in the Iliad, and consequently in the Odyssey. Considering these observations and using complex network theory concepts, we built and analyzed a social network gathered from the classical epic, the Odyssey of Homer. Seeking further understanding, we collected topological quantities in order to classify the social network qualitatively as real or fictional. It turns out that most of the observed properties belong to real social networks, the exceptions being assortativity and the giant component's size. To test whether the network could be real, we removed some mythological members that could imprint a fictional aspect on the network. After this maneuver, the modified social network exhibited assortative mixing and a reduced giant component, as expected for real social networks. Overall, we observe that the Odyssey might be an amalgam of fictional elements and real human relations, which corroborates other authors' findings for the Iliad and the archeological evidence.

We introduce a novel online Bayesian method for the identification of a family of noisy recurrent neural networks (RNNs). We develop a Bayesian active learning technique in order to optimize the interrogating stimuli given past experiences. In particular, we consider the unknown parameters as stochastic variables and use the D-optimality principle, also known as the `\emph{infomax method}', to choose optimal stimuli. We apply a greedy technique to maximize the information gain concerning network parameters at each time step. We also derive the D-optimal estimation of the additive noise that perturbs the dynamical system of the RNN. Our analytical results are approximation-free. The analytic derivation gives rise to attractive quadratic update rules.

This thesis is divided into two parts. The first presents an overview of known results in the statistical mechanics of disordered systems and its approach to random combinatorial optimization problems. The second part is a discussion of two original results. The first result concerns DPLL heuristics for random k-XORSAT, which is equivalent to the diluted Ising p-spin model. It is well known that DPLL is unable to find the ground states in the clustered phase of the problem, i.e., that it leads to contradictions with probability 1. However, no solid argument supports this in general. A class of heuristics, which includes the well-known UC and GUC, is introduced and studied. It is shown that any heuristic in this class must fail if the clause-to-variable ratio is larger than some constant, which depends on the heuristic but is always smaller than the clustering threshold. The second result concerns the properties of random k-SAT at large clause-to-variable ratios. In this regime, it is well known that the uniform distribution of random instances is dominated by unsatisfiable instances. A general technique (based on the replica method) to restrict the distribution to satisfiable instances with uniform weight is introduced and used to characterize their solutions. It is found that in the limit of large clause-to-variable ratios, the uniform distribution of satisfiable random k-SAT formulas is asymptotically equal to the much-studied planted distribution. Both results are already published and available as arXiv:0709.0367 and arXiv:cs/0609101. A more detailed and self-contained derivation is presented here.

Network dynamics are typically presented as a time series of network properties captured at each period. The current approach examines the dynamical properties of transmission via novel measures on an integrated, temporally extended network representation of interaction data across time. Because it encodes time and interactions as network connections, static network measures can be applied to this "temporal web" to reveal features of the dynamics themselves. Here we provide the technical details and apply the approach to agent-based implementations of the well-known SEIR and SEIS epidemiological models.
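One way to read the "temporal web" idea is sketched below (a hypothetical encoding assuming interactions arrive as (u, v, t) tuples; the exact construction in the paper may differ): each node-time pair becomes a vertex, with edges along an agent's own timeline and between interacting agents, so that ordinary static measures answer dynamical questions.

```python
import networkx as nx

def temporal_web(interactions, agents, t_max):
    """Build a time-layered graph from (u, v, t) interaction records."""
    G = nx.DiGraph()
    for a in agents:
        for t in range(t_max):
            # Timeline edge: an agent's state persists from t to t+1.
            G.add_edge((a, t), (a, t + 1))
    for u, v, t in interactions:
        # Interaction edges point forward in time, so transmission
        # chains correspond to ordinary directed paths in this graph.
        G.add_edge((u, t), (v, t + 1))
        G.add_edge((v, t), (u, t + 1))
    return G

G = temporal_web([("a", "b", 0), ("b", "c", 1)], agents="abc", t_max=3)
# A static measure now answers a dynamical question:
print(nx.has_path(G, ("a", 0), ("c", 3)))   # True: a can infect c via b
```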
We study three instances of log-correlated processes on the interval: the logarithm of the Gaussian unitary ensemble (GUE) characteristic polynomial, the Gaussian log-correlated potential in the presence of edge charges, and fractional Brownian motion with Hurst index $H \to 0$ (fBM0). In previous collaborations we obtained the probability distribution function (PDF) of the value of the global minimum (equivalently maximum) for the first two processes, using the {\it freezing-duality conjecture} (FDC). Here we study the PDF of the position of the maximum $x_m$ through its moments. Using the replica method, this requires calculating moments of the density of eigenvalues in the $\beta$-Jacobi ensemble. Using Jack polynomials we obtain an exact and explicit expression for both positive and negative integer moments, for arbitrary $\beta > 0$ and positive integer $n$, in terms of sums over partitions. For positive moments, this expression agrees with a very recent independent derivation by Mezzadri and Reynolds. We check our results against a contour integral formula derived recently by Borodin and Gorin (presented in Appendix A by these authors). The duality necessary for the FDC to work is proved and, in our expressions, is found to correspond to the exchange of partitions with their duals. Performing the limit $n \to 0$ and going to negative Dyson index $\beta \to -2$, we obtain the moments of $x_m$ and give explicit expressions for the lowest ones. Numerical checks for the GUE polynomials, performed independently by N. Simm, indicate encouraging agreement. Some results are also obtained for moments in the Laguerre, Hermite-Gaussian, as well as circular and related ensembles. The correlations of the position and the value of the field at the minimum are also analyzed.

The efficiency and simplicity of random algorithms have made them a lucrative alternative for solving complex problems in the domain of communication networks. This paper presents a random algorithm for handling the routing problem in Mobile Ad hoc Networks (MANETs). The performance of most existing routing protocols for MANETs degrades, in terms of packet delay and congestion, as the number of mobile nodes or their speed increases beyond a certain level. As the network becomes more and more dynamic, congestion in the network increases due to the control packets generated by routing protocols in the process of route discovery and route maintenance. Most of this congestion is due to the flooding mechanism used in protocols like AODV and DSDV for route discovery and route maintenance, or for route discovery as in the case of the DSR protocol. This paper introduces the concept of a random routing algorithm that neither maintains a routing table nor floods the entire network as done by various known protocols, thereby reducing the load on the network in terms of the number of control packets in a highly dynamic scenario. The paper also calculates the expected run time of the designed random algorithm.
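A minimal sketch of table-free random routing of this flavor (a plain random walk; the forwarding rule and the hop-limit termination are assumptions for illustration, not the paper's algorithm):

```python
import random

def random_route(neighbors, src, dst, max_hops=1000, rng=random):
    """Forward a packet to a uniformly random neighbor until dst is reached.

    neighbors: {node: [adjacent nodes]}; no routing tables, no flooding.
    Returns the hop path, or None if max_hops is exceeded.
    """
    path, node = [src], src
    while node != dst:
        if len(path) > max_hops or not neighbors[node]:
            return None
        node = rng.choice(neighbors[node])   # the only routing decision
        path.append(node)
    return path

net = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
print(random_route(net, 1, 4))
```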
Conventionally, image denoising and high-level vision tasks are handled separately in computer vision, and their connection is fragile. In this paper, we address the two jointly and explore the mutual influence between them, focusing on two questions: (1) how image denoising can help solve high-level vision problems, and (2) how semantic information from high-level vision tasks can be used to guide image denoising. We propose a deep convolutional neural network solution that cascades two modules, for image denoising and for various high-level tasks, respectively, and we propose the use of a joint loss for training, allowing semantic information to flow into the optimization of the denoising network via back-propagation. Our experimental results demonstrate that the proposed architecture not only yields superior image denoising results that preserve fine details, but also overcomes the performance degradation of different high-level vision tasks, e.g., image classification and semantic segmentation, due to image noise or the artifacts caused by conventional denoising approaches such as over-smoothing.

Proceedings of the First International Workshop on Deep Learning and Music, joint with IJCNN, Anchorage, US, May 17-18, 2017.

A driven Monte Carlo dynamics is introduced to study resistivity scaling in XY-type models in the phase representation. The method is used to study the phase transition of the three-dimensional XY spin glass with a Gaussian coupling distribution. We find a phase-coherence transition at finite temperature, in good agreement with recent equilibrium Monte Carlo simulations which show a single (spin and chiral) glass transition. Estimates of the static and dynamic critical exponents indicate that the critical behavior is in the same universality class as the model with a bimodal coupling distribution. The relevance of these results for $\pi$-junction superconductors is also discussed.

The ability to backpropagate stochastic gradients through continuous latent distributions has been crucial to the emergence of variational autoencoders and stochastic gradient variational Bayes. The key ingredient is an unbiased and low-variance way of estimating gradients with respect to distribution parameters from gradients evaluated at distribution samples. The "reparameterization trick" provides a class of transforms yielding such estimators for many continuous distributions, including the Gaussian and other members of the location-scale family. However, the trick does not readily extend to mixture density models, due to the difficulty of reparameterizing the discrete distribution over mixture weights. This report describes an alternative transform, applicable to any continuous multivariate distribution with a differentiable density function from which samples can be drawn, and uses it to derive an unbiased estimator for mixture density weight derivatives. Combined with the reparameterization trick applied to the individual mixture components, this estimator makes it straightforward to train variational autoencoders with mixture-distributed latent variables, or to perform stochastic variational inference with a mixture density variational posterior.
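For reference, the standard location-scale reparameterization that this report extends can be sketched as follows (a textbook Gaussian example, not the report's mixture-weight estimator):

```python
import torch

def reparameterized_sample(mu, log_sigma):
    """Draw z ~ N(mu, sigma^2) as a differentiable function of (mu, sigma).

    Gradients flow through mu and log_sigma because the randomness is
    isolated in eps, which does not depend on the parameters.
    """
    eps = torch.randn_like(mu)
    return mu + torch.exp(log_sigma) * eps

mu = torch.zeros(3, requires_grad=True)
log_sigma = torch.zeros(3, requires_grad=True)
z = reparameterized_sample(mu, log_sigma)
z.sum().backward()          # d z / d mu and d z / d log_sigma are defined
print(mu.grad, log_sigma.grad)
```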
In the field of empirical modeling using Genetic Programming (GP), it is important to evolve solutions with good generalization ability. The generalization ability of GP solutions is affected by two important issues: bloat and over-fitting. We survey and classify the existing literature on the different techniques used by the GP research community to deal with these issues, and we point out the limitations of these techniques, where present. Moreover, classifications of different bloat-control approaches and of measures for bloat and over-fitting are also discussed. We believe that this work will be useful to GP practitioners in the following ways: (i) to better understand the concepts of generalization in GP, (ii) to compare existing bloat and over-fitting control techniques, and (iii) to select the appropriate approach to improve the generalization ability of GP-evolved solutions.

Neurons perform computations, and convey the results of those computations through the statistical structure of their output spike trains. Here we present a practical method, grounded in the information-theoretic analysis of prediction, for inferring a minimal representation of that structure and for characterizing its complexity. Starting from spike trains, our approach finds their causal state models (CSMs), the minimal hidden Markov models or stochastic automata capable of generating statistically identical time series. We then use these CSMs to objectively quantify both the generalizable structure and the idiosyncratic randomness of the spike train. Specifically, we show that the expected algorithmic information content (the information needed to describe the spike train exactly) can be split into three parts describing (1) the time-invariant structure (complexity) of the minimal spike-generating process, which describes the spike train statistically; (2) the randomness (internal entropy rate) of the minimal spike-generating process; and (3) a residual pure noise term not described by the minimal spike-generating process. We use CSMs to approximate each of these quantities. The CSMs are inferred nonparametrically from the data, making only mild regularity assumptions, via the causal state splitting reconstruction algorithm. The methods presented here complement more traditional spike train analyses by describing not only spiking probability and spike train entropy, but also the complexity of a spike train's structure. We demonstrate our approach using both simulated spike trains and experimental data recorded in rat barrel cortex during vibrissa stimulation.

Several centrality measures have been introduced and studied for real-world networks. They account for the different vertex characteristics that permit vertices to be ranked in order of importance in the network. Betweenness centrality is a measure of the influence of a vertex over the flow of information between every pair of vertices, under the assumption that information primarily flows over the shortest paths between them. In this paper we present the betweenness centrality of some important classes of graphs.
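As a reminder of the quantity involved, the betweenness centrality of a vertex $v$ is $\sum_{s \neq v \neq t} \sigma_{st}(v)/\sigma_{st}$, where $\sigma_{st}$ counts shortest $s$-$t$ paths and $\sigma_{st}(v)$ those passing through $v$. A quick check on a small graph (an illustrative example, not one of the graph classes from the paper):

```python
import networkx as nx

# Star graph: the hub lies on every shortest path between leaf pairs,
# so its normalized betweenness is 1 and every leaf's is 0.
G = nx.star_graph(4)                      # hub node 0 plus leaves 1..4
print(nx.betweenness_centrality(G))       # {0: 1.0, 1: 0.0, ...}
```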
In this work, we present a novel approach to ontology reasoning that is based on deep learning rather than logic-based formal reasoning. To this end, we introduce a new model for statistical relational learning that is built upon deep recursive neural networks, and we give experimental evidence that it can easily compete with, or even outperform, existing logic-based reasoners on the task of ontology reasoning. More precisely, we compared our implemented system with one of the best logic-based ontology reasoners at present, RDFox, on a number of large standard benchmark datasets, and found that our system attained high reasoning quality while being up to two orders of magnitude faster.

We present experimental evidence for the different mechanisms driving the fluctuations of the local density of states (LDOS) in disordered photonic systems. We establish a clear link between the microscopic structure of the material and the frequency correlation function of the LDOS, accessed by a near-field hyperspectral imaging technique. We show, in particular, that short- and long-range frequency correlations of the LDOS are controlled by different physical processes (multiple and single scattering processes, respectively) that can be, to some extent, manipulated independently. We also demonstrate that the single-scattering contribution to LDOS fluctuations is sensitive to subwavelength features of the material and, in particular, to the correlation length of its dielectric function. Our work paves the way towards complete control of the statistical properties of disordered photonic systems, allowing for the design of materials with predefined LDOS correlations.

A method is presented which allows one to directly sample low-temperature configurations of glassy systems, such as spin glasses. The basic idea is to generate ground states and low-lying excited configurations using a heuristic algorithm. Then, with the help of microcanonical Monte Carlo simulations, more configurations are found, clusters of configurations are determined, and entropies are evaluated. Finally, equilibrium configurations are randomly sampled with proper Gibbs-Boltzmann weights. The method is applied to three-dimensional Ising spin glasses with +-J interactions at temperatures T<=0.5. The low-temperature behavior of this model is characterized by evaluating different overlap quantities, exhibiting a complex low-energy landscape for T>0, while the T=0 behavior appears to be less complex.

Energy is the key concern in sensor networks, so the main focus lies in developing a mechanism to increase the lifetime of a sensor network by balancing energy consumption. To achieve energy balancing and maximize network lifetime, we use the idea of clustering, dividing the whole network into different clusters. In this paper we propose a dynamic cluster formation method in which clusters are refreshed periodically based on residual energy, distance, and cost. Refreshing the clustering minimizes the workload of any single node and in turn enhances energy conservation. A sleep-and-wait methodology is applied to the proposed protocol to enhance network lifetime by turning nodes on and off according to their duties. A node that has data to transmit is in the on state, and after forwarding its data to the cluster head it changes its state to off, which saves the energy of the entire network. Simulations have been done using MATLAB, and the results demonstrate the improvement of our proposed method over the existing LEACH protocol.
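A sketch of one round of the kind of periodic cluster refresh described above (the scoring rule combining residual energy and distance, and its weights, are made-up illustrations, not the paper's exact criterion):

```python
import math

def refresh_clusters(nodes, n_clusters, w_energy=1.0, w_dist=0.5):
    """Pick cluster heads by score, then assign nodes to the nearest head.

    nodes: {id: {"energy": float, "pos": (x, y)}}
    The score favors high residual energy and central position (illustrative).
    """
    cx = sum(n["pos"][0] for n in nodes.values()) / len(nodes)
    cy = sum(n["pos"][1] for n in nodes.values()) / len(nodes)

    def score(n):
        return w_energy * n["energy"] - w_dist * math.dist(n["pos"], (cx, cy))

    heads = sorted(nodes, key=lambda i: score(nodes[i]), reverse=True)[:n_clusters]
    clusters = {h: [] for h in heads}
    for i, n in nodes.items():
        if i not in clusters:
            nearest = min(heads, key=lambda h: math.dist(n["pos"], nodes[h]["pos"]))
            clusters[nearest].append(i)
    return clusters
```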
Co-evolution exhibited by a network system, involving the intricate interplay between the dynamics of the network itself and the subsystems connected by it, is a key concept for understanding the self-organized, flexible nature of real-world network systems. We propose a simple model of such co-evolving network dynamics, in which the diffusion of a resource over a weighted network and the resource-driven evolution of the link weights occur simultaneously. We demonstrate that, under feasible conditions, the network robustly acquires scale-free characteristics in the asymptotic state. Interestingly, in the case that the system includes dissipation, it asymptotically realizes a dynamical phase characterized by an organized scale-free network, in which the ranking of each node with respect to the quantity of the resource it possesses changes ceaselessly. Our model offers a unified framework for understanding some real-world diffusion-driven network systems of diverse types.

We consider the capacitated selfish replication (CSR) game with binary preferences over general undirected networks. We first show that such games have an associated ordinary potential function, and hence always admit a pure-strategy Nash equilibrium (NE). Further, when the minimum degree of the network and the number of resources are of the same order, there exists an exact polynomial-time algorithm which can find a NE. Following this, we study the price of anarchy of such games and show that it is bounded above by 3; we further provide some instances for which the price of anarchy is at least 2. We develop a quasi-polynomial algorithm O(n^2 D^{ln n}), where n is the number of players and D is the diameter of the network, which can find, in a distributed manner, an allocation profile that is within a constant factor of the optimal allocation, and hence of any pure-strategy NE of the game. The proof of this result uses a novel potential function.

Local deep neural networks have recently been introduced for gender recognition. Although they achieve very good performance, they are computationally expensive to train. In this work, we introduce a simplified version of local deep neural networks which significantly reduces the training time. Instead of using hundreds of patches per image, as suggested by the original method, we propose to use 9 overlapping patches per image which cover the entire face region. This results in a much reduced training time, since just 9 patches are extracted per image instead of hundreds, at the expense of slightly reduced performance. We tested the proposed modified local deep neural networks approach on the LFW and Adience databases for the tasks of gender and age classification. For both tasks and both databases the performance is up to 1% lower compared to the original version of the algorithm. We have also investigated which patches are more discriminative for age and gender classification. It turns out that the mouth and eye regions are useful for age classification, whereas just the eye region is useful for gender classification.
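A sketch of extracting 9 overlapping patches in a 3x3 layout over a face crop (the grid layout and 50% overlap are assumptions; the abstract specifies only that 9 overlapping patches cover the face region):

```python
import numpy as np

def nine_overlapping_patches(face):
    """Cut a square face crop into a 3x3 grid of 50%-overlapping patches.

    With patch size s = H/2 and start offsets (0, s/2, H - s), the 3x3
    grid of patches spans the entire image.
    """
    h, w = face.shape[:2]
    s = h // 2                       # assumes a square, even-sized crop
    starts = [0, s // 2, h - s]
    return [face[y:y + s, x:x + s] for y in starts for x in starts]

face = np.zeros((128, 128, 3))
patches = nine_overlapping_patches(face)
print(len(patches), patches[0].shape)    # 9 patches of 64x64x3
```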
One of the famous results of network science states that networks with heterogeneous connectivity are more susceptible to epidemic spreading than their more homogeneous counterparts. In particular, in networks of identical nodes it has been shown that heterogeneity can lower the epidemic threshold at which epidemics can invade the system. Network heterogeneity can thus allow diseases with lower transmission probabilities to persist and spread. Here, we point out that for real-world applications, this result should not be regarded independently of the intra-individual heterogeneity between people. Our results show that, if heterogeneity among people is taken into account, networks that are more heterogeneous in connectivity can be more resistant to epidemic spreading. We study a susceptible-infected-susceptible model with adaptive disease avoidance. Results from this model suggest that this reversal of the effect of network heterogeneity is likely to occur in populations in which individuals are aware of their subjective disease risk. For epidemiology, this implies that network heterogeneity should not be studied in isolation.

The Kepler object KIC 12557548 shows irregular eclipsing behaviour with a constant 15.685 hr period but strongly varying transit depth. In this paper we fit individual eclipses, in addition to fitting binned light curves, to learn more about the process underlying the eclipse depth variation. Additionally, we put forward observational constraints that any model of this planet-star system will have to match. We find two quiescent spells of ~30 orbital periods each where the transit depth is <0.1%, followed by relatively deep transits. Additionally, we find periods of on-off behaviour where >0.5% deep transits are followed by apparently no transit at all. Apart from these isolated events, we find neither a significant correlation between consecutive transit depths nor a correlation between transit depth and stellar intensity. We find a three-sigma upper limit for the secondary eclipse of 4.9*10^-5, consistent with a planet candidate with a radius of less than 4600 km. Using the short cadence data we find that a 1-D exponential dust tail model is insufficient to explain the data. We improved our model to a 2-D, two-component dust model with an opaque core and an exponential tail. Using this model we fit individual eclipses observed in short cadence mode. We find an improved fit to the data, quantifying earlier suggestions by Budaj (2013) of the necessity of at least two components. We find that deep transits have most absorption in the tail, and not in a disk-shaped, opaque coma, but the transit depth and the total absorption show no correlation with the tail length.

Artificial neural network computation relies on intensive vector-matrix multiplications. Recently, emerging nonvolatile memory (NVM) crossbar arrays have shown the feasibility of implementing such operations with high energy efficiency; thus there are many works on efficiently utilizing emerging NVM crossbar arrays as analog vector-matrix multipliers. However, their nonlinear I-V characteristics restrain critical design parameters, such as the read voltage and weight range, resulting in substantial accuracy loss. In this paper, instead of optimizing hardware parameters for a given neural network, we propose a methodology for reconstructing a neural network itself optimized for resistive memory crossbar arrays. To verify the validity of the proposed method, we simulated various neural networks on the MNIST and CIFAR-10 datasets using two different specific Resistive Random Access Memory (RRAM) models. Simulation results show that our proposed neural network produces significantly higher inference accuracies than a conventional neural network when the synapse devices have nonlinear I-V characteristics.
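A toy model of why nonlinear device I-V curves corrupt an analog crossbar's vector-matrix product (the polynomial nonlinearity below is a stand-in for illustration, not either of the RRAM models used in the paper):

```python
import numpy as np

def crossbar_vmm(v_in, G, nonlinearity=0.0):
    """Analog vector-matrix multiply: column currents I = V @ G.

    An ideal device gives i = g * v; a nonlinear one bends the curve,
    modeled here as i = g * v * (1 + nonlinearity * v**2).
    """
    V = v_in[:, None]                      # one read voltage per device row
    I = G * V * (1.0 + nonlinearity * V**2)
    return I.sum(axis=0)                   # Kirchhoff sum along each column

rng = np.random.default_rng(0)
G = rng.uniform(0.1, 1.0, (4, 3))          # conductances encode weights
v = rng.uniform(0.0, 0.5, 4)               # read voltages encode inputs
print(crossbar_vmm(v, G))                  # ideal product v @ G
print(crossbar_vmm(v, G, nonlinearity=0.8))  # distorted by nonlinear I-V
```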
In this study, a network approach to the correlation patterns of void spaces in rough fractures (crack type II) was developed. We characterized friction networks with several network characteristics and related the network properties to the fracture permeability. The hubs revealed in the complex aperture networks confirmed the importance of highly correlated groups in conducting the highlighted features of the dynamical aperture field. We found a universal power law between node degree and motif frequency (for triangles it reads $T(k) \propto k^{\beta}$, with $\beta \approx 2 \pm 0.3$). The investigation of localization effects on eigenvectors shows a remarkable difference between parallel and perpendicular aperture patches. Furthermore, we estimated the rate of stored energy in asperities and found that the rate of radiated energy is higher in parallel friction networks than in transverse directions. The final part of our research highlights four-point sub-graph distributions and their correlation with fluid flow. For shear rupture, we observed a similar trend in sub-graph distributions resulting from parallel and transversal aperture profiles (a superfamily phenomenon).

The statistical distribution of levels of an integrable system is claimed to be a Poisson distribution. In this paper, we numerically generate an ensemble of $N$-dimensional random diagonal matrices as a model for regular systems. We evaluate the corresponding nearest-neighbor spacing (NNS) distribution, which characterizes the short-range correlation between levels. To characterize the long-range correlations, we evaluate the level number variance. We show that, as the size of the matrices increases, the level spacing distribution evolves from the Gaussian shape that characterizes ensembles of $2\times 2$ matrices towards the Poissonian as $N \rightarrow \infty$. The transition occurs at $N \approx 20$. The number variance also shows a gradual transition towards the straight-line behavior predicted by Poisson statistics.
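The numerical experiment is easy to reproduce in outline (a minimal sketch; the matrix size, ensemble size, and uniform entry distribution are illustrative choices, not necessarily those of the paper):

```python
import numpy as np

def nns_distribution(N, samples=2000, rng=None):
    """Nearest-neighbor spacings of random diagonal matrices.

    The eigenvalues of a diagonal matrix are just its (independent) entries,
    so for large N the unfolded spacings approach the Poisson law e^{-s}.
    """
    rng = rng or np.random.default_rng(0)
    spacings = []
    for _ in range(samples):
        levels = np.sort(rng.uniform(0.0, 1.0, N))   # the diagonal entries
        s = np.diff(levels)
        spacings.extend(s / s.mean())                # unfold to mean spacing 1
    return np.asarray(spacings)

hist, _ = np.histogram(nns_distribution(40), bins=20, range=(0, 4), density=True)
print(hist[:5])   # decays roughly like exp(-s) for large N
```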
Businesses, tourism attractions, public transportation hubs and other points of interest are not isolated but part of a collaborative system. Making such collaborative networks surface is not always an easy task. The existence of data-rich environments can assist in the reconstruction of collaborative networks; they shed light on how their members operate and reveal a potential for value creation via collaborative approaches. Social media data are an example of a means to accomplish this task. In this paper, we reconstruct a network of tourist locations using fine-grained data from Flickr, an online community for photo sharing. We have used a publicly available set of Flickr data provided by Yahoo! Labs. To analyse the complex structure of tourism systems, we have reconstructed a network of visited locations in Europe, resulting in around 180,000 vertices and over 32 million edges. An analysis of the resulting network properties reveals its complex structure.

A computer-aided detection (CADe) system for microcalcification cluster identification in mammograms has been developed in the framework of the EU-funded MammoGrid project. The CADe software is mainly based on wavelet transforms and artificial neural networks. It is able to identify microcalcifications in different datasets of mammograms (i.e., acquired with different machines and settings, digitized with different pitch and bit depth, or directly digital). The CADe can be run remotely from GRID-connected acquisition and annotation stations, supporting clinicians in geographically distant locations in the interpretation of mammographic data. We report and discuss the system's performance on different datasets of mammograms and the status of the GRID-enabled CADe analysis.

Audio Classical Composer Identification (ACC) is an important problem in Music Information Retrieval (MIR) which aims at identifying the composer of audio classical music clips. The famous annual competition, the Music Information Retrieval Evaluation eXchange (MIREX), also takes it as one of its training and testing tasks. We built a hybrid model based on a Deep Belief Network (DBN) and a Stacked Denoising Autoencoder (SDA) to identify the composer from the audio signal. As a matter of copyright, the sponsors of MIREX cannot publish their data set, so we built a comparable data set to test our model. We achieved an accuracy of 76.26% on our data set, which is better than some pure models and shallow models. We believe our method is promising even though we tested it on a different data set, since our data set is comparable in size to that of MIREX. We also found that samples from different classes become farther away from each other when transformed by more layers in our model.

We discuss the response of a quantum system to a time-dependent perturbation with spectrum $\Phi(\omega)$. This is characterised by a rate constant $D$ describing the diffusion of occupation probability between levels. We calculate the transition rates by first-order perturbation theory, so that multiplying $\Phi(\omega)$ by a constant $\lambda$ changes the diffusion constant to $\lambda D$. However, we discuss circumstances where this linearity does not extend to the function space of intensities, so that if intensities $\Phi_i(\omega)$ yield diffusion constants $D_i$, then the intensity $\sum_i \Phi_i(\omega)$ does not result in a diffusion constant $\sum_i D_i$. This `semilinear' response can occur in the absorption of radiation by small metal particles.

A theoretical study of coherent light scattering from disordered photonic crystals is presented. In addition to the conventional enhancement of the reflected light intensity in the backscattering direction, the so-called coherent backscattering (CBS), the periodic modulation of the dielectric function in photonic crystals gives rise to a qualitatively new effect: enhancement of the reflected light intensity in directions different from the backscattering direction. These additional coherent scattering processes, dubbed here coherent {\em umklapp scattering} (CUS), result in peaks which are most pronounced when the incident light beam enters the sample at an angle close to the Bragg angle. Assuming that the dielectric function modulation is weak, we study the shape of the CUS peaks for different relative lengths of the modulation-induced Bragg attenuation compared to the disorder-induced mean free path. We show that as the Bragg length increases, the CBS peak assumes its conventional shape, whereas the CUS peak rapidly diminishes in amplitude. We also study the suppression of the CUS peak upon the departure of the incident beam from the Bragg resonance: we find that the diminishing CUS intensity is accompanied by substantial broadening. In addition, the peak becomes asymmetric.

This paper proposes the incremental Bayesian optimization algorithm (iBOA), which modifies the standard BOA by removing the population of solutions and using incremental updates of the Bayesian network. iBOA is shown to be able to learn and exploit unrestricted Bayesian networks using incremental techniques for updating both the structure and the parameters of the probabilistic model. This represents an important step toward the design of competent incremental estimation-of-distribution algorithms that can solve difficult nearly decomposable problems scalably and reliably.

Natural language correction has the potential to help language learners improve their writing skills. While approaches with separate classifiers for different error types have high precision, they do not flexibly handle errors such as redundancy or non-idiomatic phrasing. On the other hand, word- and phrase-based machine translation methods are not designed to cope with orthographic errors and have recently been outpaced by neural models. Motivated by these issues, we present a neural network-based approach to language correction.
The core component of our method is an encoder-decoder recurrent neural network with an attention mechanism. By operating at the character level, the network avoids the problem of out-of-vocabulary words. We illustrate the flexibility of our approach on a dataset of noisy, user-generated text collected from an English learner forum. When combined with a language model, our method achieves a state-of-the-art $F_{0.5}$-score on the CoNLL 2014 Shared Task. We further demonstrate that training the network on additional data with synthesized errors can improve performance.

Learning latent structure in complex networks has become an important problem, fueled by the many types of networked data originating from practically all fields of science. In this paper, we propose a new non-parametric Bayesian multiple-membership latent feature model for networks. Contrary to existing multiple-membership models that scale quadratically in the number of vertices, the proposed model scales linearly in the number of links, admitting multiple-membership analysis in large-scale networks. We demonstrate a connection between the single-membership relational model and multiple-membership models, and we show on "real"-size benchmark network data that accounting for multiple memberships improves the learning of latent structure, as measured by link prediction, while explicitly accounting for multiple memberships results in a more compact representation of the latent structure of networks.

Computational color constancy, which requires estimation of the illuminant colors of images, is a fundamental yet active problem in computer vision that can be formulated as a regression problem. To learn a robust regressor for color constancy, obtaining meaningful imagery features and capturing latent correlations across output variables play a vital role. In this work, we introduce a novel deep structured-output regression learning framework to achieve both goals simultaneously. By borrowing the power of deep convolutional neural networks (CNNs), originally designed for visual recognition, the proposed framework can automatically discover strong features for white balancing over different illumination conditions and learn a multi-output regressor that goes beyond the underlying relationships between features and targets to capture the complex interdependence among the different dimensions of the target variables. Experiments on two public benchmarks demonstrate that our method achieves competitive performance in comparison with state-of-the-art approaches.

We consider the spreading of a wave packet in the generalized Rosenzweig-Porter random matrix ensemble in the region of non-ergodic extended states, $1<\gamma<2$. We show that despite the non-trivial fractal dimensions $0 < D_{q}=2-\gamma<1$ characterizing the wave function statistics in this region, the wave packet spreading $\langle r^{2} \rangle \propto t^{\beta}$ is governed by the "diffusion" exponent $\beta=1$ outside the ballistic regime $t>\tau\sim 1$, with $\langle r^{2}\rangle \propto t^{2}$ in the ballistic regime $t<\tau\sim 1$. This demonstrates that the multifractality exhibits itself only in {\it local} quantities like the wave packet survival probability, but not in the large-distance spreading of the wave packet.
In this paper we study homomorphisms of Probabilistic Regulatory Gene Networks (PRNs) introduced in arXiv:math.DS/0603289 v1 13 Mar 2006. The PRN model is a natural generalization of the Probabilistic Boolean Networks (PBNs), introduced by I. Shmulevich, E. Dougherty, and W. Zhang in 2001, which have been used to describe genetic networks and have therapeutic applications. In this paper, our main objectives are to apply the concepts of homomorphism and $\epsilon$-homomorphism of probabilistic regulatory networks to the dynamics of the networks. The meaning of $\epsilon$ is that these homomorphic networks have similar distributions, with the distance between the distributions bounded above by $\epsilon$. Additionally, we prove that the class of PRNs together with the homomorphisms forms a category with products and coproducts. Projections are special homomorphisms, and they always induce invariant subnetworks that contain all the cycles and steady states in the network. Here, it is proved that an $\epsilon$-homomorphism for $0<\epsilon<1$ produces simultaneous Markov chains in both networks, which permits introducing the concepts of $\epsilon$-isomorphism of Markov chains and similar networks.

Object category localization is a challenging problem in computer vision. Standard supervised training requires bounding box annotations of object instances. This time-consuming annotation process is sidestepped in weakly supervised learning. In this case, the supervised information is restricted to binary labels that indicate the absence/presence of object instances in the image, without their locations. We follow a multiple-instance learning approach that iteratively trains the detector and infers the object locations in the positive training images. Our main contribution is a multi-fold multiple-instance learning procedure, which prevents training from prematurely locking onto erroneous object locations. This procedure is particularly important when using high-dimensional representations, such as Fisher vectors and convolutional neural network features. We also propose a window refinement method which improves the localization accuracy by incorporating an objectness prior. We present a detailed experimental evaluation using the PASCAL VOC 2007 dataset, which verifies the effectiveness of our approach.

In this paper, we propose an improved gravitational search algorithm named GSABC. The algorithm improves gravitational search algorithm (GSA) results by using the artificial bee colony (ABC) algorithm to solve constrained numerical optimization problems. In GSA, solutions are attracted towards each other by gravitational forces that depend on the masses assigned to the solutions. The heaviest mass moves more slowly than the other masses and attracts them. Due to the nature of gravitation, GSA may pass over the global minimum if some solutions become stuck in local minima. ABC updates the positions of the best solutions obtained from GSA, preventing GSA from sticking to a local minimum through its strong searching ability. The proposed algorithm improves the performance of GSA. The proposed method was tested on 23 well-known unimodal, multimodal and fixed-point multimodal benchmark test functions. Experimental results show that GSABC outperforms or performs similarly to five state-of-the-art optimization approaches.
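For orientation, one iteration of the underlying GSA mass-and-force update can be sketched as follows (a generic textbook-style GSA step; the constants, gravitational decay schedule, and fitness function are illustrative, and the ABC refinement stage is omitted):

```python
import numpy as np

def gsa_step(X, V, fitness, G, rng):
    """One gravitational search step: update masses, forces, velocities.

    Better (lower-fitness) solutions get larger masses; each solution
    accelerates toward the others under randomly weighted pulls.
    """
    f = np.array([fitness(x) for x in X])
    m = (f.max() - f) / (f.max() - f.min() + 1e-12)   # best solution -> mass 1
    M = m / m.sum()
    acc = np.zeros_like(X)
    for i in range(len(X)):
        for j in range(len(X)):
            if i != j:
                d = X[j] - X[i]
                acc[i] += rng.random() * G * M[j] * d / (np.linalg.norm(d) + 1e-12)
    V = rng.random(X.shape) * V + acc                  # inertia plus gravity
    return X + V, V

rng = np.random.default_rng(0)
X, V = rng.uniform(-5, 5, (20, 2)), np.zeros((20, 2))
for t in range(100):
    X, V = gsa_step(X, V, lambda x: (x**2).sum(), G=np.exp(-0.02 * t), rng=rng)
print(np.abs(X).mean())   # population contracts toward the sphere minimum at 0
```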
Compared to the traditional stacking model, the proposed method can dynamically adjust the weights of individual models as we move across the graph and provide a more versatile and significantly more accurate stacking model for label prediction on a network. We demonstrate the benefits of the proposed model using both a simulation study and real data analysis. We establish a new framework for statistical estimation of directed acyclic graphs (DAGs) when data are generated from a linear, possibly non-Gaussian structural equation model. Our framework consists of two parts: (1) inferring the moralized graph from the support of the inverse covariance matrix; and (2) selecting the best-scoring graph amongst DAGs that are consistent with the moralized graph. We show that when the error variances are known or estimated to sufficiently high precision, the true DAG is the unique minimizer of the score computed using the reweighted squared $\ell_2$-loss. Our population-level results have implications for the identifiability of linear SEMs when the error covariances are specified up to a constant multiple. On the statistical side, we establish rigorous conditions for high-dimensional consistency of our two-part algorithm, defined in terms of a "gap" between the true DAG and the next best candidate. Finally, we demonstrate that dynamic programming may be used to select the optimal DAG in linear time when the treewidth of the moralized graph is bounded. The FORTRAN code POLRAD 2.0 for radiative correction calculation in inclusive and semi-inclusive deep inelastic scattering of polarized leptons by polarized nucleons and nuclei is described. Its theoretical basis, structure and algorithms are discussed in detail. Air transportation has become a major part of the transportation infrastructure worldwide. Hence the study of airport networks, the backbone of air transportation, is becoming increasingly important. In the complex systems domain, airport networks are modeled as graphs (networks) comprising airports (vertices or nodes) linked by flight connections among them. A complex network analysis of such a model offers holistic insight about the performance and risks in such a network. We review the performance and risks of networks with the help of studies that have been done on several airport networks. We present various network parameters that could potentially be used as measures of performance and risk in airport networks. We also examine how various risks, such as the breakdown of airports or the spread of diseases across the airport network, could be assessed based on the network parameters. Further, we review how these insights could possibly be used to shape more efficient and safer airport networks. The current work addresses quantum machine learning in the context of Quantum Artificial Neural Networks such that the networks' processing is divided in two stages: the learning stage, where the network converges to a specific quantum circuit, and the backpropagation stage where the network effectively works as a self-programming quantum computing system that selects the quantum circuits to solve computing problems. The results are extended to general architectures including recurrent networks that interact with an environment, coupling with it in the neural links' activation order, and self-organizing in a dynamical regime that intermixes patterns of dynamical stochasticity and persistent quasiperiodic dynamics, giving rise to a form of noise-resilient dynamical record.
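As a companion to the airport-network review above, the following toy sketch shows how the network parameters it mentions (clustering, path length, betweenness) and a hub-removal risk scenario can be computed with standard tooling. The scale-free graph is a stand-in, not real airport data.

```python
# Toy robustness analysis of an airport-like network with networkx;
# a scale-free (Barabasi-Albert) graph stands in for a real airport network.
import networkx as nx

G = nx.barabasi_albert_graph(200, 2, seed=1)

# Parameters commonly used as performance/risk measures.
print("avg clustering:", nx.average_clustering(G))
print("avg shortest path:", nx.average_shortest_path_length(G))
bc = nx.betweenness_centrality(G)

# Risk scenario: remove the most central "airports" and watch connectivity.
H = G.copy()
for node in sorted(bc, key=bc.get, reverse=True)[:10]:
    H.remove_node(node)
giant = max(nx.connected_components(H), key=len)
print("largest component after removing 10 hubs:",
      len(giant), "of", H.number_of_nodes())
```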
The prevalent approach to sequence-to-sequence learning maps an input sequence to a variable-length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks. Compared to recurrent models, computations over all elements can be fully parallelized during training, and optimization is easier since the number of non-linearities is fixed and independent of the input length. Our use of gated linear units eases gradient propagation, and we equip each decoder layer with a separate attention module. We outperform the accuracy of the deep LSTM setup of Wu et al. (2016) on both WMT'14 English-German and WMT'14 English-French translation at an order of magnitude faster speed, both on GPU and CPU. Recent neural IR models have demonstrated deep learning's utility in ad-hoc information retrieval. However, deep models have a reputation for being black boxes, and the roles of a neural IR model's components may not be obvious at first glance. In this work, we attempt to shed light on the inner workings of a recently proposed neural IR model, namely the PACRR model, by visualizing the output of intermediate layers and by investigating the relationship between intermediate weights and the ultimate relevance score produced. We highlight several insights, hoping that such insights will be generally applicable. Cells sense the geometry and stiffness of their adhesive environment by active contractility. For strong adhesion to flat substrates, two-dimensional contractile network models can be used to understand how force is distributed throughout the cell. Here we compare the shape and force distribution for different variants of such network models. In contrast to Hookean networks, cable networks reflect the asymmetric response of biopolymers to tension versus compression. For passive networks, contractility is modeled by a reduced resting length of the mechanical links. In actively contracting networks, a constant force couple is introduced into each link in order to model contraction by molecular motors. If combined with fixed adhesion sites, all network models lead to invaginated cell shapes, but only actively contracting cable networks lead to the circular arc morphology typical for strongly adhering cells. In this case, shape and force distribution are determined by local rather than global determinants and thus are suited to endow the cell with a robust sense of its environment. We also discuss non-linear and adaptive linker mechanics as well as the relation to tissue shape. We present the results of the search for decaying dark matter with particle mass in the 6-40 keV range with NuSTAR deep observations of the COSMOS and ECDFS empty sky fields. We show that the main contribution to the decaying dark matter signal from the Milky Way galaxy comes through the aperture of the NuSTAR detector, rather than through the focusing optics. The high sensitivity of the NuSTAR detector, combined with the large aperture and long exposure times of the two observation fields, allows us to improve previously existing constraints on the dark matter decay time by up to an order of magnitude in the mass range 10-30 keV. In the particular case of the nuMSM sterile neutrino dark matter, our constraints impose an upper bound m<20 keV on the dark matter particle mass. We report detection of four unidentified spectral lines in our data set.
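The gated linear unit used as the non-linearity in the convolutional sequence-to-sequence model described above can be sketched in a few lines: one half of each convolution output gates the other half. Shapes below are illustrative.

```python
# Minimal numpy sketch of the gated linear unit (GLU):
# GLU(x) = a * sigmoid(b), where [a, b] is x split in half along channels.
import numpy as np

def glu(x):
    """x: (..., 2d) -> (..., d)."""
    a, b = np.split(x, 2, axis=-1)
    return a * (1.0 / (1.0 + np.exp(-b)))

# Example: a 1-D convolution output with 2*d channels at each of T positions.
T, d = 7, 4
conv_out = np.random.default_rng(0).normal(size=(T, 2 * d))
h = glu(conv_out)
print(h.shape)  # (7, 4): gated activations
```

The ungated half passes through linearly, which is the basis of the "eases gradient propagation" claim in the abstract.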
These line detections are either due to systematic effects (uncertainties in the calibration of the NuSTAR detectors) or have an astrophysical origin. We discuss different possibilities for testing the nature of the detected lines. It is widely acknowledged that the forthcoming 5G architecture will be highly heterogeneous and deployed with a high degree of density. These changes relative to the current 4G bring many challenges in achieving efficient operation from the network management perspective. In this article, we introduce a revolutionary vision of the future 5G wireless networks, in which the network is no longer limited by hardware or even software. Specifically, building on the idea of virtualizing wireless networks, which has recently gained increasing attention, we introduce the Everything-as-a-Service (XaaS) taxonomy to light the way towards designing service-oriented wireless networks. The concepts and challenges, along with the research opportunities for realizing XaaS in wireless networks, are overviewed and discussed. Image segmentation is an important step in most visual tasks. While convolutional neural networks have been shown to perform well on single-image segmentation, to our knowledge no study has been done on leveraging recurrent gated architectures for video segmentation. Accordingly, we propose a novel method for online segmentation of video sequences that incorporates temporal data. The network is built from a fully convolutional element and a recurrent unit that works on a sliding window over the temporal data. We also introduce a novel convolutional gated recurrent unit that preserves the spatial information and reduces the number of parameters learned. Our method has the advantage that it can work in an online fashion instead of operating over the whole input batch of video frames. The network is tested on the change detection dataset, and shown to achieve a 5.5\% improvement in F-measure over a plain fully convolutional network for per-frame segmentation. It also showed a 1.4\% improvement in F-measure compared to our baseline network, which we call FCN 12s. We study the ferromagnetic phase transition in a randomly layered Heisenberg model. A recent strong-disorder renormalization group approach [Phys. Rev. B 81, 144407 (2010)] predicted that the critical point in this system is of exotic infinite-randomness type and is accompanied by strong power-law Griffiths singularities. Here, we report results of Monte-Carlo simulations that provide numerical evidence in support of these predictions. Specifically, we investigate the finite-size scaling behavior of the magnetic susceptibility, which is characterized by a non-universal power-law divergence in the Griffiths phase. In addition, we calculate the time autocorrelation function of the spins. It features a very slow decay in the Griffiths phase, following a non-universal power law in time. We propose NM landscapes as a new class of tunably rugged benchmark problems. NM landscapes are well-defined on alphabets of any arity, including both discrete and real-valued alphabets, include epistasis in a natural and transparent manner, are proven to have known value and location of the global maximum and, with some additional constraints, are proven to also have a known global minimum. Empirical studies are used to illustrate that, when coefficients are selected from a recommended distribution, the ruggedness of NM landscapes is smoothly tunable and correlates with several measures of search difficulty.
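The convolutional gated recurrent unit mentioned in the video-segmentation abstract above can be sketched by replacing the GRU's matrix products with convolutions, so the hidden state keeps its spatial layout. Kernel size and gate layout below are assumptions for illustration, not the paper's exact design.

```python
# Minimal PyTorch sketch of a convolutional GRU cell: standard GRU equations
# with convolutions instead of dense matrix multiplies, preserving the
# (channels, height, width) layout of the hidden state.
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        p = k // 2
        self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=p)  # update & reset
        self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)       # candidate state

    def forward(self, x, h):
        z, r = torch.chunk(torch.sigmoid(self.gates(torch.cat([x, h], 1))), 2, 1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], 1)))
        return (1 - z) * h + z * h_tilde

# One step over a feature map from a fully convolutional front end.
cell = ConvGRUCell(in_ch=16, hid_ch=32)
x = torch.randn(1, 16, 64, 64)
h = torch.zeros(1, 32, 64, 64)
h = cell(x, h)
print(h.shape)  # torch.Size([1, 32, 64, 64])
```

Because the gates are small convolutions shared across all spatial positions, the cell has far fewer parameters than a dense GRU over flattened frames, which matches the parameter-reduction claim in the abstract.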
We discuss why these properties make NM landscapes preferable to both NK landscapes and Walsh polynomials as benchmark landscape models with tunable epistasis. We present a detailed analytical study of the $A+A\to\emptyset$ diffusion-annihilation process in complex networks. By means of microscopic arguments, we derive a set of rate equations for the density of $A$ particles in vertices of a given degree, valid for any generic degree distribution, which we solve for uncorrelated networks. For homogeneous networks (with bounded fluctuations), we recover the standard mean-field solution, i.e. a particle density decreasing as the inverse of time. For heterogeneous (scale-free) networks in the infinite network size limit, we obtain instead a density decreasing as a power law, with an exponent depending on the degree distribution. We also analyze the role of finite-size effects, showing that any finite scale-free network leads to the mean-field behavior, with a prefactor depending on the network size. We check our analytical predictions with extensive numerical simulations on homogeneous networks with Poisson degree distribution and scale-free networks with different degree exponents. In this paper, we propose an end-to-end neural network (NN) based EEG-speech (NES) modeling framework, in which three network structures are developed to map imagined EEG signals to phonemes. The proposed NES models incorporate a language model based EEG feature extraction layer, an acoustic feature mapping layer, and a restricted Boltzmann machine (RBM) based feature learning layer. The NES models can jointly realize the representation of multichannel EEG signals and the projection of acoustic speech signals. Among the three proposed NES models, two augmented networks utilize spoken EEG signals as either bias or gate information to strengthen the feature learning and translation of imagined EEG signals. Experimental results show that all three proposed NES models outperform the baseline support vector machine (SVM) method on EEG-speech classification. With respect to binary classification, our approach achieves results comparable to those of the deep belief network approach. We propose a duality analysis for obtaining the critical manifold of two-dimensional spin glasses. Our method is based on the computation of quenched free energies with periodic and twisted periodic boundary conditions on a finite basis. The precision can be systematically improved by increasing the size of the basis, leading to very fast convergence towards the thermodynamic limit. We apply the method to obtain the phase diagrams of the random-bond Ising model and $q$-state Potts gauge glasses. In the Ising case, the Nishimori point is found at $p_N = 0.10929 \pm 0.00002$, in agreement with and improving on the precision of existing numerical estimations. Similar precision is found throughout the high-temperature part of the phase diagram. Finite-size effects are larger in the low-temperature region, but our results are in qualitative agreement with the known features of the phase diagram. In particular, we show analytically that the critical point in the ground state is located at finite $p_0$. We have measured directly the thermal conductance between electrons and phonons in ultra-thin Hf and Ti films at millikelvin temperatures. The experimental data indicate that electron-phonon coupling in these films is significantly suppressed by disorder.
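The homogeneous mean-field result quoted in the diffusion-annihilation abstract above (a particle density decaying as the inverse of time) can be verified numerically in a few lines. The rate constant below is arbitrary and the forward-Euler integrator is only for illustration.

```python
# Numerical check of the mean-field rate equation for A + A -> 0:
# d(rho)/dt = -lam * rho^2, with closed-form solution rho0 / (1 + lam*rho0*t),
# which decays as 1/t at long times.
import numpy as np

lam, rho0, dt, steps = 1.0, 1.0, 1e-3, 200000
rho, t = rho0, 0.0
ts, rhos = [], []
for _ in range(steps):
    rho += -lam * rho**2 * dt        # forward-Euler step of the rate equation
    t += dt
    ts.append(t); rhos.append(rho)

ts, rhos = np.array(ts), np.array(rhos)
exact = rho0 / (1 + lam * rho0 * ts)   # closed-form solution
print("max relative error:", np.max(np.abs(rhos - exact) / exact))
print("rho(t) * t at late times ->", rhos[-1] * ts[-1])  # approaches 1/lam
```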
The electron cooling time $\tau_\epsilon$ follows a $T^{-4}$ dependence, with a record-long value $\tau_\epsilon = 25\,\mathrm{ms}$ at $T = 0.04\,\mathrm{K}$. The hot-electron detectors of far-infrared radiation fabricated from such films are expected to have a very high sensitivity. The noise equivalent power of a detector with an area of $1\,\mu\mathrm{m}^2$ would be $(2-3)\times 10^{-20}\,\mathrm{W/Hz^{1/2}}$, which is two orders of magnitude smaller than that of the state-of-the-art bolometers. This paper considers the problem of energy-efficient transmission in multi-flow multihop cooperative wireless networks. Although the performance gains of cooperative approaches are well known, the combinatorial nature of these schemes makes it difficult to design efficient polynomial-time algorithms for joint routing, scheduling and power control. This becomes more so when there is more than one flow in the network. Many authors in the literature have conjectured that the multiflow problem in cooperative networks is NP-hard. In this paper, we formulate the problem as a combinatorial optimization problem for a general setting of $k$ flows, and formally prove that the problem is not only NP-hard but also $o(n^{1/7-\epsilon})$ inapproximable. To our knowledge, these results provide the first such inapproximability proof in the context of multiflow cooperative wireless networks. We further prove that for the special case of $k = 1$ the solution is a simple path, and devise a polynomial-time algorithm for jointly optimizing routing, scheduling and power control. We then use this algorithm to establish analytical upper and lower bounds on the optimal performance for the general case of $k$ flows. Furthermore, we propose a polynomial-time heuristic for calculating the solution for the general case and evaluate the performance of this heuristic under different channel conditions and against the analytical upper and lower bounds. Source localization in ocean acoustics is posed as a machine learning problem in which data-driven methods learn source ranges directly from observed acoustic data. The pressure received by a vertical linear array is preprocessed by constructing a normalized sample covariance matrix (SCM) and used as the input. Three machine learning methods (feed-forward neural networks (FNN), support vector machines (SVM) and random forests (RF)) are investigated in this paper, with a focus on the FNN. The range estimation problem is solved both as a classification problem and as a regression problem by these three machine learning algorithms. The results of range estimation for the Noise09 experiment are compared for FNN, SVM, RF and conventional matched-field processing, and demonstrate the potential of machine learning for underwater source localization. Handwritten string recognition is still a challenging task, even though powerful deep learning tools have been introduced. In this paper, based on TAO-FCN, we propose an end-to-end system for handwritten string recognition. Compared with conventional methods, no preprocessing or manually designed rules are employed. With enough labelled data, it is easy to apply the proposed method to different applications. Although the performance of the proposed method may not be comparable with the state-of-the-art approaches, its usability and robustness are more meaningful for practical applications.
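The SCM preprocessing described in the underwater source-localization abstract above can be sketched as follows. The unit-norm snapshot normalization and the use of the upper triangle as the feature vector are assumptions about reasonable choices, not necessarily the paper's exact recipe.

```python
# Building a normalized sample covariance matrix (SCM) from vertical-array
# pressure snapshots, for use as machine-learning input.
import numpy as np

rng = np.random.default_rng(0)
n_sensors, n_snapshots = 16, 32
# Complex pressure snapshots (e.g., one FFT bin per snapshot); synthetic here.
p = rng.normal(size=(n_sensors, n_snapshots)) \
    + 1j * rng.normal(size=(n_sensors, n_snapshots))

p = p / np.linalg.norm(p, axis=0, keepdims=True)   # unit-norm each snapshot
scm = (p @ p.conj().T) / n_snapshots               # Hermitian (n_sensors, n_sensors)

# Flatten real and imaginary parts of the upper triangle as the input vector.
iu = np.triu_indices(n_sensors)
x = np.concatenate([scm[iu].real, scm[iu].imag])
print(scm.shape, x.shape)
```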
The Pacific Rim Application and Grid Middleware Assembly (PRAGMA) is an international community of researchers that actively collaborate to address problems and challenges of common interest in eScience. The PRAGMA Experimental Network Testbed (PRAGMA-ENT) was established with the goal of constructing an international software-defined network (SDN) testbed to offer the necessary networking support to the PRAGMA cyberinfrastructure. PRAGMA-ENT is isolated, and PRAGMA researchers have complete freedom to access network resources to develop, experiment, and evaluate new ideas without the concern of interfering with production networks. In the first phase, PRAGMA-ENT focused on establishing an international L2 backbone. With support from the Florida Lambda Rail (FLR), Internet2, PacificWave, JGN-X, and TWAREN, the PRAGMA-ENT backbone connects OpenFlow-enabled switches at the University of Florida (UF), University of California San Diego (UCSD), Nara Institute of Science and Technology (NAIST, Japan), Osaka University (Japan), National Institute of Advanced Industrial Science and Technology (AIST, Japan), and National Center for High-Performance Computing (Taiwan). The second phase of PRAGMA-ENT consisted of the evaluation of control-plane technologies that enable multiple experiments (i.e., multiple OpenFlow controllers) to co-exist. Preliminary experiments with FlowVisor revealed some limitations, leading to the development of a new approach called AutoVFlow. This paper shares our experience in the establishment of the PRAGMA-ENT backbone (with international L2 links), its current status, and plans for the control plane. Preliminary application ideas, including optimization of routing control, multipath routing control, and remote visualization, will also be discussed. We consider the effect of geometric frustration induced by the random distribution of loop lengths in the "fat" graphs of the dynamical triangulations model on coupled antiferromagnets. While the influence of such connectivity disorder is rather mild for ferromagnets, in that an ordered phase persists and only the properties of the phase transition are substantially changed in some cases, any finite-temperature transition is wiped out due to frustration for some of the antiferromagnetic models. A wealth of different phenomena is observed: while for the annealed average of quantum gravity some graphs can adapt dynamically to allow the emergence of a Neel ordered phase, this is not possible for the quenched average, where a zero-temperature spin-glass phase appears instead. We relate the latter to the behaviour of conventional spin-glass models coupled to random graphs. The superconductor-insulator transition in the presence of strong compensation of dopants was recently realized in La-doped YBCO. The compensation of acceptors by donors makes it possible to independently change the concentration of holes n and the total concentration of charged impurities N. We propose a theory of the superconductor-insulator phase diagram in the (N,n) plane. It exhibits interesting new features in the case of strong-coupling superconductivity, where Cooper pairs are compact, non-overlapping bosons. For compact Cooper pairs the transition occurs at a significantly higher density than in the case of spatially overlapping pairs.
We establish the superconductor-insulator phase diagram by studying how the potential of randomly positioned charged impurities is screened by holes or by strongly bound Cooper pairs, both in isotropic and layered superconductors. In the resulting self-consistent potential the carriers are either delocalized or localized, which corresponds to the superconducting or insulating phase, respectively. Identifying and designing physical systems for use as qubits, the basic units of quantum information, are critical steps in the development of a quantum computer. Among the possibilities in the solid state, a defect in diamond known as the nitrogen-vacancy (NV-1) center stands out for its robustness - its quantum state can be initialized, manipulated, and measured with high fidelity at room temperature. Here we describe how to systematically identify other deep center defects with similar quantum-mechanical properties. We present a list of physical criteria that these centers and their hosts should meet and explain how these requirements can be used in conjunction with electronic structure theory to intelligently sort through candidate defect systems. To illustrate these points in detail, we compare electronic structure calculations of the NV-1 center in diamond with those of several deep centers in 4H silicon carbide (SiC). We then discuss the proposed criteria for similar defects in other tetrahedrally-coordinated semiconductors. Observations of deuterated species are useful in probing the temperature, ionization level, evolutionary stage, chemistry, and thermal history of astrophysical environments. The analysis of data from ALMA and other new telescopes requires an elaborate model of deuterium fractionation. This paper presents a publicly available chemical network with multi-deuterated species and an extended, up-to-date set of gas-phase and surface reactions. To test this network, we simulate deuterium fractionation in diverse interstellar sources. Two cases of initial abundances are considered: i) atomic except for H2 and HD, and ii) molecular from a prestellar core. We reproduce the observed D/H ratios of many deuterated molecules, and sort the species according to their sensitivity to temperature gradients and initial abundances. We find that many multiply-deuterated species produced at 10 K retain enhanced D/H ratios at temperatures $\la 100$ K. We study how recent updates to reaction rates affect calculated D/H ratios, and perform a detailed sensitivity analysis of the uncertainties of the gas-phase reaction rates in the network. We find that uncertainties are generally lower in dark cloud environments than in warm IRDCs and that uncertainties increase with the size of the molecule and number of D-atoms. A set of the most problematic reactions is presented. We list potentially observable deuterated species predicted to be abundant in low- and high-mass star-formation regions. A very important topic in systems biology is developing statistical methods that automatically find causal relations in gene regulatory networks with no prior knowledge of causal connectivity. Many methods have been developed for time series data. However, discovery methods based on steady-state data are often necessary and preferable since obtaining time series data can be more expensive and/or infeasible for many biological systems. A conventional approach is causal Bayesian networks. However, estimation of Bayesian networks is ill-posed. 
In many cases it cannot uniquely identify the underlying causal network and only gives a large class of equivalent causal networks that cannot be distinguished based on the data distribution. We propose a new discovery algorithm for uniquely identifying the underlying causal network of genes. To the best of our knowledge, the proposed method is the first algorithm for learning gene networks based on a fully identifiable causal model called LiNGAM. We compare our algorithm with competing algorithms using artificially generated data, although it would be preferable to also test it on real microarray gene expression data. We propose `Dracula', a new framework for unsupervised feature selection from sequential data such as text. Dracula learns a dictionary of $n$-grams that efficiently compresses a given corpus and recursively compresses its own dictionary; in effect, Dracula is a `deep' extension of Compressive Feature Learning. It requires solving a binary linear program that may be relaxed to a linear program. Both problems exhibit considerable structure, their solution paths are well behaved, and we identify parameters which control the depth and diversity of the dictionary. We also discuss how to derive features from the compressed documents and show that while certain unregularized linear models are invariant to the structure of the compressed dictionary, this structure may be used to regularize learning. Experiments are presented that demonstrate the efficacy of Dracula's features. Many-body localization in a disordered system of interacting spins coupled by the long-range interaction $1/R^{\alpha}$ is investigated by combining an analytical theory of resonant interactions with a finite-size scaling of exact numerical solutions for a number of spins $N$. The numerical results for a one-dimensional system are consistent with the general expectations of the analytical theory for a $d$-dimensional system, including the absence of localization in the infinite system at $\alpha<2d$ and a universal scaling of the critical disorder strength $W_{c} \propto N^{\frac{2d-\alpha}{d}}$. Deep brain stimulation (DBS) is a surgical treatment for Parkinson's Disease. Static models based on the quasi-static approximation are common approaches for DBS modeling. While this simplification has been validated for bioelectric sources, its application to rapid stimulation pulses, which contain more high-frequency power, may not be appropriate, as DBS therapeutic results depend on stimulus parameters such as frequency and pulse width, which are related to time variations of the electric field. We propose an alternative hybrid approach based on probabilistic models and differential equations, using Gaussian processes and the wave equation. Our model avoids the quasi-static approximation; moreover, it is able to describe the dynamic behavior of DBS. Therefore, the proposed model may be used to obtain a more realistic description of the phenomenon. The proposed model can also solve inverse problems, i.e., recover the corresponding source of excitation given an electric potential distribution. The electric potential produced by a time-varying source was predicted using the proposed model. For static sources, the electric potential produced by different electrode configurations was modeled.
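The probabilistic half of the DBS model just described can be illustrated, in a heavily simplified way, with plain Gaussian-process regression on synthetic space-time potential samples. The wave-equation coupling that the authors rely on is omitted here, so this sketch only shows the regression machinery, and all data and kernel choices are assumptions.

```python
# Toy GP regression on a synthetic traveling-wave "potential" over (x, t);
# a stand-in for fitting measured electric potentials, not the authors' model.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
Xtr = rng.uniform(0, 1, size=(200, 2))              # columns: position x, time t
ytr = np.sin(2 * np.pi * (Xtr[:, 0] - 0.5 * Xtr[:, 1])) \
      + 0.05 * rng.normal(size=200)                 # noisy toy potential

gp = GaussianProcessRegressor(kernel=RBF(length_scale=[0.2, 0.2]), alpha=1e-2)
gp.fit(Xtr, ytr)

# Predict the potential (with uncertainty) along a spatial slice at t = 0.5.
Xte = np.column_stack([np.linspace(0, 1, 5), np.full(5, 0.5)])
mean, std = gp.predict(Xte, return_std=True)
print(np.round(mean, 2), np.round(std, 3))
```

The predictive standard deviation is what makes the probabilistic formulation attractive for inverse problems: regions where the source is poorly constrained show up as large posterior uncertainty.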
Four different sources of excitation were recovered by solving the inverse problem. We compare our outcomes with the electric potential obtained by solving Poisson's equation using the Finite Element Method (FEM). Our approach is able to take into account time variations of the source and the produced field. Also, the inverse problem can be addressed using the proposed model. The electric potential calculated with the proposed model is close to the potential obtained by solving Poisson's equation using FEM. Benes networks are constructed with simple switch modules and have many advantages, including small latency and requiring only an almost linear number of switch modules. As circuit switches, Benes networks are rearrangeably non-blocking, which implies that they are full-throughput as packet switches, with suitable routing. Routing in Benes networks can be done by time-sharing permutations. However, this approach requires centralized control of the switch modules and statistical knowledge of the traffic arrivals. We propose a backpressure-based routing scheme for Benes networks, combined with end-to-end congestion control. This approach achieves the maximal utility of the network and requires only four queues per module, independently of the size of the network. Previous theoretical studies on the interaction of excitatory and inhibitory neurons proposed to model this cortical microcircuit motif as a so-called Winner-Take-All (WTA) circuit. However, a recent modeling study found that the WTA model is not adequate for the data-based softer forms of divisive inhibition found in a microcircuit motif in cortical layer 2/3. We investigate here, through theoretical analysis, the role of such softer divisive inhibition for the emergence of computational operations and neural codes under spike-timing-dependent plasticity (STDP). We show that, in contrast to WTA models - where the network activity has been interpreted as probabilistic inference in a generative mixture distribution - this network dynamics approximates inference in a noisy-OR-like generative model that explains the network input based on multiple hidden causes. Furthermore, we show that STDP optimizes the parameters of this model by approximating online the expectation-maximization (EM) algorithm. This theoretical analysis corroborates a preceding modelling study which suggested that the learning dynamics of this layer 2/3 microcircuit motif extracts a specific modular representation of the input and thus performs blind source separation on the input statistics. The main aim of this paper is to discuss how the combination of Web 2.0, social media and geographic technologies can provide opportunities for learning and new forms of participation in an urban design studio. This discussion is mainly based on our recent findings from two experimental urban design studio setups as well as prior research and literature studies. In brief, the web platform enabled us to extend the learning that took place in the design studio beyond the studio hours, to represent the design information in novel ways, and to accommodate multiple forms of communication. We found that student activity on the introduced web platform was related to their progress up to a certain extent. Moreover, the students perceived the platform as a convenient medium and regarded it as a valuable resource for learning.
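The backpressure rule behind the Benes-network routing scheme described earlier in this section can be illustrated on a toy topology: each node forwards a packet to the neighbor with the largest positive queue differential, so congestion gradients steer traffic without centralized control. This is generic backpressure routing on a ring, an illustrative assumption, not the paper's four-queue Benes construction.

```python
# Toy backpressure routing on a ring: packets flow "downhill" in queue length.
import random

random.seed(0)
n = 8                                        # ring of 8 nodes
nbrs = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
q = [0] * n                                  # one queue per node
SRC, SINK = 0, 4

for t in range(10000):
    q[SRC] += 1                              # exogenous arrival at the source
    for u in range(n):
        if q[u] == 0:
            continue
        v = max(nbrs[u], key=lambda w: q[u] - q[w])
        if q[u] - q[v] > 0:                  # backpressure condition
            q[u] -= 1
            q[v] += 1
    q[SINK] = 0                              # the sink drains its queue
print("queue backlogs around the ring:", q)
```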
This study should be conceived as a continuation of a series of our Design Studio 2.0 experiments, which exploit the opportunities provided by novel socio-geographic information and communication technologies to improve design learning processes. Over the last few years, Cloud Radio Access Network (C-RAN) has arisen as a transformative architecture for 5G cellular networks that brings the flexibility and agility of cloud computing to wireless communications. At the same time, content caching in wireless networks has become an essential solution to lower the content-access latency and backhaul traffic loading, which translate into user Quality of Experience (QoE) improvement and network cost reduction. In this article, a novel Cooperative Hierarchical Caching (CHC) framework in C-RAN is introduced, where contents are jointly cached at the BaseBand Unit (BBU) and at the Radio Remote Heads (RRHs). Unlike in traditional approaches, the cache at the BBU, the cloud cache, presents a new layer in the cache hierarchy, bridging the latency/capacity gap between the traditional edge-based and core-based caching schemes. Trace-driven simulations reveal that CHC yields up to 80% improvement in cache hit ratio, 21% decrease in average content-access latency, and 20% reduction in backhaul traffic load compared to the edge-only caching scheme with the same total cache capacity. Before closing the article, several challenges and promising opportunities for deploying content caching in C-RAN are highlighted, pointing towards a content-centric mobile wireless network. We model the cooperation policy with only two parameters -- search radius $r$ and number of copies in the network $N_{copy}$. These two parameters represent the range of cooperation and the tolerance of duplicates. We show how the cooperation policy impacts content distribution, and further illustrate the relation between content popularity and topological properties. Our work yields many implications for how to take advantage of topological properties in the design of in-network caching strategies. We focus on constructing the domi-join model by performing the join operation on the two smallest dominating sets of two network models, and we analyze the properties of the domi-join model, such as its power-law degree distribution and small-world character. Besides, we introduce two classes of edge-bound growing network models to explain the construction process of the domi-join model. Then we compute the average degree, clustering coefficient, and power-law degree distribution of the domi-join model. Finally, we discuss an effective method for cutting down redundant operations in the domi-join model. In order to better accommodate the dramatically increasing demand for data caching and computing services, some of the intermediate nodes within the network should be endowed with storage and computation capabilities. In this paper, we design a novel virtualized heterogeneous networks framework aimed at enabling content caching and computing. With the virtualization of the whole system, the communication, computing and caching resources can be shared among all users associated with different virtual service providers. We formulate the virtual resource allocation strategy as a joint optimization problem, where the gains of not only virtualization but also caching and computing are taken into consideration in the proposed architecture.
In addition, a distributed algorithm based on the alternating direction method of multipliers (ADMM) is adopted to solve the formulated problem, in order to reduce the computational complexity and signaling overhead. Finally, extensive simulations are presented to show the effectiveness of the proposed scheme under different system parameters. This paper introduces the probabilistic module interface, which allows encapsulation of complex probabilistic models with latent variables alongside custom stochastic approximate inference machinery, and provides a platform-agnostic abstraction barrier separating the model internals from the host probabilistic inference system. The interface can be seen as a stochastic generalization of a standard simulation and density interface for probabilistic primitives. We show that sound approximate inference algorithms can be constructed for networks of probabilistic modules, and we demonstrate that the interface can be implemented using learned stochastic inference networks and MCMC and SMC approximate inference programs. We present a sparse and invariant representation with low asymptotic complexity for robust unsupervised transient and onset zone detection in noisy environments. This unsupervised approach is based on wavelet transforms and leverages the scattering network of Mallat et al. by deriving frequency invariance. This frequency invariance is a key concept for enforcing robust representations of transients in the presence of possible frequency shifts and perturbations occurring in the original signal. Implementation details as well as a complexity analysis are provided in addition to the theoretical framework and the invariance properties. In this work, our primary application consists of predicting the onset of seizures in epileptic patients from subdural recordings, as well as detecting inter-ictal spikes. An associative memory is a framework of content-addressable memory that stores a collection of message vectors (or a dataset) over a neural network while enabling a neurally feasible mechanism to recover any message in the dataset from its noisy version. Designing an associative memory requires addressing two main tasks: 1) learning phase: given a dataset, learn a concise representation of the dataset in the form of a graphical model (or a neural network); 2) recall phase: given a noisy version of a message vector from the dataset, output the correct message vector via a neurally feasible algorithm over the network learnt during the learning phase. This paper studies the problem of designing a class of neural associative memories which learn a network representation for a large dataset that ensures correction against a large number of adversarial errors during the recall phase. Specifically, the associative memories designed in this paper can store a dataset containing $\exp(n)$ $n$-length message vectors over a network with $O(n)$ nodes and can tolerate $\Omega(\frac{n}{{\rm polylog} n})$ adversarial errors. This paper carries out this memory design by mapping the learning phase and recall phase to the tasks of dictionary learning with a square dictionary and iterative error correction in an expander code, respectively. To dynamically detect the facial landmarks in a video, we propose a novel hybrid framework termed detection-tracking-detection (DTD). First, the face bounding box is obtained from the first frame of the video sequence based on a traditional face detection method.
Then, a landmark detector, based on a cascaded deep convolutional neural network (DCNN), detects the facial landmarks. Next, the face bounding box in the current frame is estimated and validated after the facial landmarks in the previous frame are tracked based on the median flow. Finally, the facial landmarks in the current frame are exactly detected from the validated face bounding box via the landmark detector. Experimental results indicate that the proposed framework can detect the facial landmarks in a video sequence more effectively and in less time than the frame-by-frame method using the DCNN. To avoid the complicated topology of surviving clusters induced by standard Strong Disorder RG in dimension $d>1$, we introduce a modified procedure called 'Boundary Strong Disorder RG' where the order of decimations is chosen a priori. We apply this modified procedure numerically to the Random Transverse Field Ising model in dimension $d=2$. We find that the location of the critical point, the activated exponent $\psi \simeq 0.5$ of the Infinite Disorder scaling, and the finite-size correlation exponent $\nu_{FS} \simeq 1.3$ are compatible with the values obtained previously by standard Strong Disorder RG. Our conclusion is thus that Strong Disorder RG is very robust with respect to changes in the order of decimations. In addition, we analyze in more detail the RG flows within the two phases to show explicitly the presence of various correlation length exponents: we measure the typical correlation exponent $\nu_{typ} \simeq 0.64$ in the disordered phase (this value is very close to the correlation exponent $\nu^Q_{pure}(d=2) \simeq 0.63$ of the {\it pure} two-dimensional quantum Ising model), and the typical exponent $\nu_h \simeq 1$ within the ordered phase. These values satisfy the relations between critical exponents imposed by the expected finite-size scaling properties at Infinite Disorder critical points. Within the disordered phase, we also measure the fluctuation exponent $\omega \simeq 0.35$, which is compatible with the Directed Polymer exponent $\omega_{DP}(1+1)=1/3$ in $(1+1)$ dimensions. We survey the contributions presented in the working group ``Diffraction and Vector Mesons'' at the XIV International Workshop on Deep Inelastic Scattering. In dissipationless linear media, spatial disorder induces Anderson localization of matter, light, and sound waves. The addition of nonlinearity causes interaction between the eigenmodes, which results in a slow wave diffusion. We go beyond the dissipationless limit of Anderson arrays and consider nonlinear disordered systems that are subjected to dissipative losses and energy pumping. We show that the Anderson modes of the disordered Ginzburg-Landau lattice possess specific excitation thresholds with respect to the pumping strength. When pumping is increased above the threshold for the band-edge modes, the lattice dynamics yields an attractor in the form of a stable multi-peak pattern. The Anderson attractor is the result of a joint action of pumping-induced mode excitation, nonlinearity-induced mode interactions, and dissipative stabilization. The regimes of Anderson attractors could potentially be realized with polariton condensate lattices, active waveguide arrays, or cavity-QED arrays. Complex systems are successfully reduced to interacting elements via the network concept. Transport plays a key role in the survival of networks.
For example, the specialized signaling cascades of cellular networks filter noise and efficiently adapt the network structure to new stimuli. However, our general understanding of transport mechanisms and signaling pathways in complex systems is still limited. Here we summarize the key network structures involved in transport, list the solutions available to overloaded systems for relaxing their load, and outline a possible method for the computational determination of signaling pathways. We highlight that in addition to hubs, bridges and the network skeleton, the overlapping modular structure is also essential in network transport. Moreover, by locating network elements in the space of overlapping network modules and evaluating their distance in this "module space", it may be possible to approximate signaling pathways computationally, which, in turn, could serve the identification of signaling pathways of complex systems. Our model may be applicable in a wide range of fields including traffic control or drug design. We use the annealed formulation of complex networks to study the dynamical behavior of disease spreading on both static and adaptive networked systems. This unifying approach relies on the annealed adjacency matrix, representing one network ensemble, and allows us to solve the dynamical evolution of the whole network ensemble all at once. Our results accurately reproduce those obtained by extensive numerical simulations, showing a large improvement with respect to the usual heterogeneous mean-field formulation. Moreover, by means of the annealed formulation we derive a new heterogeneous mean-field formulation that correctly reproduces the epidemic dynamics. Introduction to the Special Issue on Complex Networks, Artificial Life journal. Recommendation algorithms that incorporate techniques from deep learning are becoming increasingly popular. Due to the structure of the data coming from recommendation domains (i.e., one-hot-encoded vectors of item preferences), these algorithms tend to have large input and output dimensionalities that dominate their overall size. This makes them difficult to train, due to the limited memory of graphical processing units, and difficult to deploy on mobile devices with limited hardware. To address these difficulties, we propose Bloom embeddings, a compression technique that can be applied to the input and output of neural network models dealing with sparse high-dimensional binary-coded instances. Bloom embeddings are computationally efficient, and do not seriously compromise the accuracy of the model up to 1/5 compression ratios. In some cases, they even improve over the original accuracy, with relative increases of up to 12%. We evaluate Bloom embeddings on 7 data sets and compare them against 4 alternative methods, obtaining favorable results. We also discuss a number of further advantages of Bloom embeddings, such as 'on-the-fly' constant-time operation, zero or marginal space requirements, training time speedups, or the fact that they do not require any change to the core model architecture or training configuration. We study the dynamics of excitations in a system of $O(N)$ quantum rotors in the presence of random fields and random anisotropies. Below the lower critical dimension $d_{\mathrm{lc}}=4$ the system exhibits a quasi-long-range order with a power-law decay of correlations.
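The Bloom-embedding construction described in the recommendation abstract above compresses one-hot (or multi-hot) inputs by hashing item ids into a short binary vector with several hash functions, exactly as in a Bloom filter. A minimal sketch, with salted MD5 standing in for the hash family (an illustrative assumption):

```python
# Bloom-embedding sketch: ids from a huge vocabulary are hashed by k functions
# into an m-dimensional binary vector that replaces the one-hot encoding.
import hashlib
import numpy as np

def bloom_embed(item_ids, m=64, k=3):
    """Compress a set of item ids (multi-hot over a huge vocab) into an m-dim binary vector."""
    v = np.zeros(m, dtype=np.float32)
    for item in item_ids:
        for i in range(k):  # k "independent" hash functions via salted md5
            h = int(hashlib.md5(f"{i}:{item}".encode()).hexdigest(), 16)
            v[h % m] = 1.0
    return v

# Two users' item-preference sets from a vocabulary that could hold millions of ids.
u1 = bloom_embed([12, 90345, 7, 555666])
u2 = bloom_embed([12, 90345, 8, 555667])
print(u1.sum(), np.dot(u1, u2))  # small dense codes; overlap reflects shared items
```

The code length m is fixed regardless of vocabulary size, which is what shrinks the input and output layers of the recommender.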
At zero temperature the spin waves are localized at the length scale $L_{\mathrm{loc}}$, beyond which quantum tunneling is exponentially suppressed as $e^{-(L/L_{\mathrm{loc}})^{2(\theta+1)}}$. At finite temperature $T$ the spin waves propagate by thermal activation over energy barriers that scale as $L^{\theta}$. Above $d_{\mathrm{lc}}$ the system undergoes an order-disorder phase transition with activated dynamics such that the relaxation time grows with the correlation length $\xi$ as $\tau \sim e^{C \xi^\theta/T}$ at finite temperature and as $\tau \sim e^{C' \xi^{2(\theta+1)}/\hbar^2}$ in the vicinity of the quantum critical point. For a linear code, deep holes are defined to be vectors that are further away from codewords than all other vectors. The problem of deciding whether a received word is a deep hole for generalized Reed-Solomon codes is proved to be co-NP-complete. For the extended Reed-Solomon codes $RS_q(\F_q,k)$, a conjecture to classify deep holes was made by Cheng and Murray in 2007. Since then a lot of effort has been made to prove the conjecture, or its various forms. In this paper, we classify deep holes completely for generalized Reed-Solomon codes $RS_p(D,k)$, where $p$ is a prime, $|D| > k \geqslant \frac{p-1}{2}$. Our techniques are built on the idea of deep hole trees, and several results concerning the Erd\H{o}s-Heilbronn conjecture. Trees have long been used as a graphical representation of species relationships. However, complex evolutionary events, such as genetic reassortments or hybrid speciations, which occur commonly in viruses, bacteria and plants, do not fit into this elementary framework. Alternatively, various network representations have been developed. Circular networks are a natural generalization of leaf-labeled trees interpreted as split systems, that is, collections of bipartitions over leaf labels corresponding to current species. Although such networks do not explicitly model specific evolutionary events of interest, their straightforward visualization and fast reconstruction have made them a popular exploratory tool to detect network-like evolution in genetic datasets. Standard reconstruction methods for circular networks, such as Neighbor-Net, rely on an associated metric on the species set. Such a metric is first estimated from DNA sequences, which leads to a key difficulty: distantly related sequences produce statistically unreliable estimates. This is problematic for Neighbor-Net as it is based on the popular tree reconstruction method Neighbor-Joining, whose sensitivity to distance estimation errors is well established theoretically. In the tree case, more robust reconstruction methods have been developed using the notion of a distorted metric, which captures the dependence of the error in the distance through a radius of accuracy. Here we design the first circular network reconstruction method based on distorted metrics. Our method is computationally efficient. Moreover, the analysis of its radius of accuracy highlights the important role played by the maximum incompatibility, a measure of the extent to which the network differs from a tree. We present a study of the application of a variant of a recently introduced heuristic algorithm for the optimization of transport routes on complex networks to the problem of finding the optimal routes of communication between nodes on wireless networks. Our algorithm iteratively balances network traffic by minimizing the maximum node betweenness on the network.
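A sketch of the iterative betweenness-balancing idea just stated (the base variant for generic networks, not the broadcast-aware wireless variant introduced next): repeatedly locate the node with maximum betweenness and penalize routes through it so shortest paths spread the load. The multiplicative penalty rule is an illustrative assumption.

```python
# Iteratively reduce the maximum node betweenness by penalizing edges
# incident to the most congested node, so weighted shortest paths reroute.
import networkx as nx

G = nx.random_geometric_graph(60, 0.25, seed=2)
for u, v in G.edges:
    G.edges[u, v]["w"] = 1.0

for it in range(20):
    bc = nx.betweenness_centrality(G, weight="w")
    hot = max(bc, key=bc.get)                 # most congested node
    for u, v in G.edges(hot):                 # discourage paths through it
        G.edges[u, v]["w"] *= 1.5
    if it % 5 == 0:
        print(f"iter {it}: max betweenness = {bc[hot]:.3f}")
```

Since the transport capacity of such networks is limited by the most congested node, driving down the maximum betweenness directly raises the sustainable traffic load.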
The variant we consider specifically accounts for the broadcast restrictions imposed by wireless communication by using a different betweenness measure. We compare the performance of our algorithm to two other known algorithms and find that our algorithm achieves the highest transport capacity both for minimum node degree geometric networks, which are directed geometric networks that model wireless communication networks, and for configuration model networks that are uncorrelated scale-free networks. The exact closed-form expressions for the outage probability and bit error rate of spectrum-sharing-based multi-hop decode-and-forward (DF) relay networks in non-identical Rayleigh fading channels are derived. We also provide an approximate closed-form expression for the system ergodic capacity. Utilizing these tractable analytical formulas, we can study the impact of key network parameters on the performance of cognitive multi-hop relay networks under interference constraints. Using a linear network model, we derive an optimum relay position scheme by numerically solving an optimization problem of balancing the average signal-to-noise ratio (SNR) of each hop. The numerical results show that the optimal scheme leads to SNR performance gains of more than 1 dB. All the analytical expressions are verified by Monte-Carlo simulations, confirming the advantage of multihop DF relaying networks in cognitive environments. Deep neural networks (DNNs) provide useful models of visual representational transformations. We present a method that enables a DNN (student) to learn from the internal representational spaces of a reference model (teacher), which could be another DNN or, in the future, a biological brain. Representational spaces of the student and the teacher are characterized by representational distance matrices (RDMs). We propose representational distance learning (RDL), a stochastic gradient descent method that drives the RDMs of the student to approximate the RDMs of the teacher. We demonstrate that RDL is competitive with other transfer learning techniques for two publicly available benchmark computer vision datasets (MNIST and CIFAR-100), while allowing for architectural differences between student and teacher. By pulling the student's RDMs towards those of the teacher, RDL significantly improved visual classification performance when compared to baseline networks that did not use transfer learning. In the future, RDL may enable combined supervised training of deep neural networks using task constraints (e.g. images and category labels) and constraints from brain-activity measurements, so as to build models that replicate the internal representational spaces of biological brains. It has been recently observed that the dynamical properties of mass action systems arising from many models of biochemical reaction networks can be derived by considering the corresponding properties of a related generalized mass action system. In particular, the correspondence process known as network translation has been shown to be useful in characterizing a system's steady states. In this paper, we further develop the theory of network translation with particular focus on a subclass of translations known as improper translations. For these translations, we derive conditions on the network topology of the translated network which are sufficient to guarantee that the original and translated systems share the same steady states.
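The quantity at the heart of the representational distance learning (RDL) method described above can be made concrete in a few lines: each model's representational distance matrix (RDM) is computed from its activation patterns, and the student is trained to shrink the mismatch between its RDM and the teacher's. Correlation distance and a squared-error mismatch are the usual representational-similarity-analysis choices assumed here, not necessarily the paper's exact ones.

```python
# RDM comparison sketch: the loss RDL would minimize by stochastic gradient
# descent, shown here as a plain numpy computation.
import numpy as np

def rdm(acts):
    """acts: (n_stimuli, n_units) -> (n_stimuli, n_stimuli) correlation-distance RDM."""
    return 1.0 - np.corrcoef(acts)

rng = np.random.default_rng(0)
teacher = rng.normal(size=(10, 128))                   # teacher-layer activations
student = teacher[:, :32] + rng.normal(scale=0.5, size=(10, 32))  # different width

iu = np.triu_indices(10, k=1)                          # off-diagonal entries only
loss = np.mean((rdm(student)[iu] - rdm(teacher)[iu]) ** 2)
print("RDM mismatch loss:", round(float(loss), 4))
```

Note that the RDMs have the same shape regardless of layer width, which is why RDL can transfer between architecturally different student and teacher models.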
We then present a mixed-integer linear programming (MILP) algorithm capable of determining whether a mass action system can be corresponded to a generalized system through the process of network translation. The Shintani-Tanaka model is a glass-forming system whose constituents interact via an anisotropic potential depending on the angle of a unit vector carried by each particle. The decay of time-correlation functions of the unit vectors exhibits the characteristics of generic relaxation functions during glass transitions. In particular, it exhibits a 'stretched exponential' form, with the stretching index $\beta$ depending strongly on the temperature. We construct a quantitative theory of this correlation function by analyzing all the physical processes that contribute to it, separating a rotational from a translational decay channel. Interestingly, the separate decay function of each of these processes is temperature independent. Taken together with temperature-dependent weights determined a priori by statistical mechanics, one generates the observed correlation function in quantitative agreement with simulations at different temperatures. This underlines the danger of concluding anything about glassy relaxation functions without detailed physical scrutiny. Using the dedicated computer Janus, we follow the nonequilibrium dynamics of the Ising spin glass in three dimensions over eleven orders of magnitude. The use of integral estimators for the coherence and correlation lengths allows us to study dynamic heterogeneities and the presence of a replicon mode, and to obtain safe bounds on the Edwards-Anderson order parameter below the critical temperature. We obtain good agreement with experimental determinations of the temperature-dependent decay exponents for the thermoremanent magnetization. This quantity is observed to scale with the much harder to measure coherence length, a potentially useful result for experimentalists. The exponents for energy relaxation display a linear dependence on temperature and reasonable extrapolations to the critical point. We conclude by examining the time growth of the coherence length, with a comparison of critical and activated dynamics. Models of neural networks have proven their utility in the development of learning algorithms in computer science and in the theoretical study of brain dynamics in computational neuroscience. We propose in this paper a spatial neural network model to analyze the important class of functional networks, which are commonly employed in computational studies of clinical brain imaging time series. We developed a simulation framework inspired by multichannel brain surface recordings (more specifically, EEG -- electroencephalogram) in order to link the mesoscopic network dynamics (represented by sampled functional networks) and the microscopic network structure (represented by an integrate-and-fire neural network located in a 3D space -- hence the term spatial neural network). Functional networks are obtained by computing pairwise correlations between time-series of mesoscopic electric potential dynamics, which allows the construction of a graph where each node represents one time-series. The spatial neural network model is central in this study in the sense that it allowed us to characterize sampled functional networks in terms of what features they are able to reproduce from the underlying spatial network.
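The functional-network construction just described reduces to: correlate every pair of channel time series and keep sufficiently correlated pairs as edges. A self-contained sketch on synthetic data (the correlation threshold is an arbitrary assumption):

```python
# Functional network from time series: nodes are channels, edges connect
# channel pairs whose correlation exceeds a threshold.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
n_channels, T = 12, 1000
common = rng.normal(size=T)                           # shared "mesoscopic" drive
ts = 0.6 * common + rng.normal(size=(n_channels, T))  # channel time series

C = np.corrcoef(ts)                                   # pairwise correlation matrix
G = nx.Graph()
G.add_nodes_from(range(n_channels))
for i in range(n_channels):
    for j in range(i + 1, n_channels):
        if C[i, j] > 0.2:                             # keep sufficiently correlated pairs
            G.add_edge(i, j, weight=C[i, j])
print("edges:", G.number_of_edges(), "density:", nx.density(G))
```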
Our modeling approach shows that, in specific conditions of sample size and edge density, it is possible to precisely estimate several network measurements of spatial networks by just observing functional samples. Emergent behaviors are in the focus of recent research interest. It is then of considerable importance to investigate what optimizations suit the learning and prediction of chaotic systems, the putative candidates for emergence. We have compared L1 and L2 regularizations on predicting chaotic time series using linear recurrent neural networks. The internal representation and the weights of the networks were optimized in a unifying framework. Computational tests on different problems indicate considerable advantages for the L1 regularization: it had considerably better learning time and better interpolating capabilities. We shall argue that optimization viewed as a maximum likelihood estimation justifies our results, because L1 regularization fits heavy-tailed distributions -- an apparently general feature of emergent systems -- better. In recent years, there have been many computational simulations of spontaneous neural dynamics. Here, we explore a model of spontaneous neural dynamics and allow it to control a virtual agent moving in a simple environment. This setup generates interesting brain-environment feedback interactions that rapidly destabilize neural and behavioral dynamics and suggest the need for homeostatic mechanisms. We investigate roles for both local homeostatic plasticity (local inhibition adjusting over time to balance excitatory input) as well as macroscopic task-negative activity (that compensates for task-positive, sensory input) in regulating both neural activity and resulting behavior (trajectories through the environment). Our results suggest complementary functional roles for both local homeostatic plasticity and balanced activity across brain regions in maintaining neural and behavioral dynamics. These findings suggest important functional roles for homeostatic systems in maintaining neural and behavioral dynamics and suggest a novel functional role for frequently reported macroscopic task-negative patterns of activity (e.g., the default mode network). In this study, we investigate the complexity of two-phase flow (air/water) in a heterogeneous soil sample by using complex network theory, where the porous medium is assumed to be non-deformable, under time-dependent gas pressure. Based on different similarity measures (i.e., correlation and Euclidean metrics) over the patterns that emerge from the evolution of the saturation of the non-wetting phase of a multi-heterogeneous soil sample, the corresponding complex networks are constructed. Understanding the properties of these complex networks (such as degree distribution, mean path length, and clustering coefficient) offers a way to analyze the variation of saturation profile structures (obtained as finite element solutions of the coupled PDEs), where the complexity comes from the changing connections and links between the assumed nodes. Also, the evolution path of the system is illustrated in the state space of networks for both the correlation and Euclidean measures. The analysis shows that, in a closed system, the constructed complex networks approach a small-world network, where the mean path length and clustering coefficient are low and high, respectively.
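In the spirit of the L1-versus-L2 comparison in the chaotic-prediction abstract above, the following toy experiment contrasts Lasso (L1) and Ridge (L2) on one-step prediction of a chaotic logistic-map series. The linear lag model is illustrative and much simpler than the paper's recurrent-network setting.

```python
# Compare L1 (Lasso) and L2 (Ridge) regularization on predicting a chaotic
# series from lagged values; report test error and coefficient sparsity.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

x = np.empty(2000); x[0] = 0.4
for t in range(1999):
    x[t + 1] = 4.0 * x[t] * (1 - x[t])       # chaotic logistic map

lags = 5
X = np.column_stack([x[i:len(x) - lags + i] for i in range(lags)])
y = x[lags:]
Xtr, Xte, ytr, yte = X[:1500], X[1500:], y[:1500], y[1500:]

for model in (Lasso(alpha=1e-3, max_iter=10000), Ridge(alpha=1e-3)):
    model.fit(Xtr, ytr)
    err = np.mean((model.predict(Xte) - yte) ** 2)
    nz = np.sum(np.abs(model.coef_) > 1e-6)
    print(type(model).__name__,
          "test MSE:", round(float(err), 4),
          "nonzero coefs:", int(nz))
```

The sparsity-inducing L1 penalty typically zeroes out uninformative lags, which is the mechanism behind the interpretability and heavy-tail-fitting arguments made in the abstract.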
As another result, the evolution of the macro-states of the system (such as the mean velocity of air, or pressure) can be scaled with characteristics of the structural complexity of saturation. In addition, we tried to find a phase-transition criterion based on the variation of non-wetting-phase velocity profiles over a network constructed from the correlation distance. With a simple attack and repair evolution model, we investigate and compare the stability of the Erdos-Renyi random graphs (RG) and Barabasi-Albert scale-free (SF) networks. We introduce a new quantity, invulnerability I(s), to describe the stability of the system. We find that both RG and SF networks can evolve to a stationary state. The stationary value I_c has a power-law dependence on the average degree <k>_rg for RG networks, and an exponential relationship with the repair probability p_sf for SF networks. We also discuss the topological changes of RG and SF networks between the initial and stationary states. We observe that the networks in the stationary state have a smaller average degree but a larger clustering coefficient C and stronger assortativity r. Neural machine translation (NMT) models are able to partially learn syntactic information from sequential lexical information. Still, some complex syntactic phenomena such as prepositional phrase attachment are poorly modeled. This work aims to answer two questions: 1) Does explicitly modeling target language syntax help NMT? 2) Is tight integration of words and syntax better than multitask training? We introduce syntactic information in the form of CCG supertags in the decoder, by interleaving the target supertags with the word sequence. Our results on WMT data show that explicitly modeling target-syntax improves machine translation quality for German->English, a high-resource pair, and for Romanian->English, a low-resource pair, as well as several syntactic phenomena including prepositional phrase attachment. Furthermore, a tight coupling of words and syntax improves translation quality more than multitask training. By combining target-syntax with adding source-side dependency labels in the embedding layer, we obtain a total improvement of 0.9 BLEU for German->English and 1.2 BLEU for Romanian->English. Spiking Neural Networks (SNN) are more closely related to brain-like computation and inspire hardware implementation. Such implementation is enabled by small networks that give high performance on standard classification problems. In the literature, typical SNNs are deep and complex in terms of network structure, weight update rules and learning algorithms, which makes it difficult to translate them into hardware. In this paper, we first develop a simple 2-layered network in software which compares with the state of the art on four different standard data-sets within SNNs and has improved efficiency. For example, it uses 3x fewer neurons, 3.5x fewer synapses, and 30x fewer training epochs for the Fisher Iris classification problem. The efficient network is based on effective population coding and synapse-neuron co-design. Second, we develop a computationally efficient (15,000x) and accurate (correlation of 0.98) method to evaluate the performance of the network without standard recognition tests. Third, we show that the method produces a robustness metric that can be used to evaluate noise tolerance. Deep neural networks (DNN) abstract by demodulating the output of linear filters.
In this article, we refine this definition of abstraction to show that the inputs of a DNN are abstracted with respect to the filters. Or, to restate, the abstraction is qualified by the filters. This leads us to introduce the notion of qualitative projection. We use qualitative projection to abstract MNIST hand-written digits with respect to the various dogs, horses, planes and cars of the CIFAR dataset. We then classify the MNIST digits according to the magnitude of their dogness, horseness, planeness and carness qualities, illustrating the generality of qualitative projection. The superiority of deeply learned pedestrian representations has been reported in the very recent literature on person re-identification (re-ID). In this paper, we consider the more pragmatic issue of learning a deep feature with no or only a few labels. We propose a progressive unsupervised learning (PUL) method to transfer pretrained deep representations to unseen domains. Our method is easy to implement and can be viewed as an effective baseline for unsupervised re-ID feature learning. Specifically, PUL iterates between 1) pedestrian clustering and 2) fine-tuning of the convolutional neural network (CNN) to improve the original model trained on the irrelevant labeled dataset. Since the clustering results can be very noisy, we add a selection operation between the clustering and fine-tuning. At the beginning, when the model is weak, the CNN is fine-tuned on a small number of reliable examples that lie near cluster centroids in the feature space. As the model becomes stronger in subsequent iterations, more images are adaptively selected as CNN training samples. Progressively, pedestrian clustering and the CNN model are improved simultaneously until algorithm convergence. This process is naturally formulated as self-paced learning. We then point out promising directions that may lead to further improvement. Extensive experiments on three large-scale re-ID datasets demonstrate that PUL outputs discriminative features that improve the re-ID accuracy. We prove that a particular deep network architecture is more efficient at approximating radially symmetric functions than the best known 2- or 3-layer networks. We use this architecture to approximate Gaussian kernel SVMs, and subsequently improve upon them with further training. The architecture and initial weights of the Deep Radial Kernel Network are completely specified by the SVM and therefore sidestep the problem of empirically choosing an appropriate deep network architecture. We present a large N solution of a microscopic model describing the Mott-Anderson transition on a finite-coordination Bethe lattice. Our results demonstrate that strong spatial fluctuations, due to Anderson localization effects, dramatically modify the quantum critical behavior near disordered Mott transitions. The leading critical behavior of quasiparticle wavefunctions is shown to assume a universal form in the full range from weak to strong disorder, in contrast to disorder-driven non-Fermi liquid ("electronic Griffiths phase") behavior, which is found only in the strongly correlated regime. Automatically generated political event data is an important part of the social science data ecosystem. The approaches for generating this data, though, have remained largely the same for two decades. During this time, the field of computational linguistics has progressed tremendously.
This paper presents an overview of political event data, including methods and ontologies, and a set of experiments to determine the applicability of deep neural networks to the extraction of political events from news text. Cut vertices, a generalization of matrix elements of local operators, are revisited, and an expansion in terms of minimally subtracted cut vertices is formulated. An extension of the formalism to deal with semi-inclusive deep inelastic processes in the target fragmentation region is explicitly constructed. The problem of factorization is discussed in detail. Reasoning and inference require processing as well as memory skills in humans. Neural networks can handle tasks like image recognition (better than humans), but their memory capabilities are still limited (by the attention mechanism and memory size). Recurrent Neural Networks (RNN) and their modified version, LSTM, are able to handle small memory contexts, but as the context grows beyond a threshold, they become difficult to use. The solution is to use a large external memory. Still, this poses many challenges, such as how to train neural networks with discrete memory representations and how to capture long-term dependencies in sequential data. The most prominent neural architectures for such tasks are Memory Networks (inference components combined with a long-term memory) and Neural Turing Machines (neural networks using external memory resources). Additional techniques, such as attention mechanisms and end-to-end gradient descent over discrete memory representations, are needed to support these solutions. Preliminary results of the above neural architectures on simple algorithmic tasks (sorting, copying) and question answering applications (based on stories and dialogs) are comparable with the state of the art. In this paper, I explain these architectures (in general), the additional techniques used, and the results of their application. In this work, we present a novel 3D-Convolutional Neural Network (CNN) architecture called I2I-3D that predicts boundary location in volumetric data. Our fine-to-fine, deeply supervised framework addresses three critical issues in 3D boundary detection: (1) efficient, holistic, end-to-end volumetric label training and prediction; (2) precise voxel-level prediction to capture fine-scale structures prevalent in medical data; and (3) directed multi-scale, multi-level feature learning. We evaluate our approach on a dataset consisting of 93 medical image volumes with a wide variety of anatomical regions and vascular structures. In the process, we also introduce HED-3D, a 3D extension of the state-of-the-art 2D edge detector (HED). We show that our deep learning approach outperforms the current state of the art in 3D vascular boundary detection (structured forests 3D) by a large margin, as well as HED applied to slices and HED-3D, while successfully localizing fine structures. With our approach, boundary detection takes about one minute on a typical 512x512x512 volume. Multi-task learning (MTL) involves the simultaneous training of two or more related tasks over shared representations. In this work, we apply MTL to audio-visual automatic speech recognition (AV-ASR). Our primary task is to learn a mapping between audio-visual fused features and frame labels obtained from an acoustic GMM/HMM model. This is combined with an auxiliary task which maps visual features to frame labels obtained from a separate visual GMM/HMM model.
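As an illustration of such a shared-representation setup, here is a minimal sketch assuming PyTorch; the layer sizes, the single shared trunk applied to the fused features (the paper feeds visual features to the auxiliary task), and the auxiliary-loss weight are our assumptions, not the paper's configuration.

import torch
import torch.nn as nn

class MTLNet(nn.Module):
    """Shared trunk with a primary and an auxiliary classification head."""
    def __init__(self, in_dim=200, hidden=512, n_states=1000):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.primary = nn.Linear(hidden, n_states)    # acoustic GMM/HMM frame labels
        self.auxiliary = nn.Linear(hidden, n_states)  # visual GMM/HMM frame labels

    def forward(self, fused_feats):
        h = self.shared(fused_feats)
        return self.primary(h), self.auxiliary(h)

def mtl_loss(primary_logits, aux_logits, y_acoustic, y_visual, aux_weight=0.3):
    # Weighted sum of the two frame-classification losses.
    ce = nn.functional.cross_entropy
    return ce(primary_logits, y_acoustic) + aux_weight * ce(aux_logits, y_visual)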
The MTL model is tested at various levels of babble noise and the results are compared with a baseline hybrid DNN-HMM AV-ASR model. Our results indicate that MTL is especially useful at higher levels of noise. Compared to the baseline, up to 7\% relative improvement in WER is reported at -3 dB SNR. To operate intelligently in domestic environments, robots require the ability to understand arbitrary spatial relations between objects and to generalize them to objects of varying sizes and shapes. In this work, we present a novel end-to-end approach utilizing neural networks to generalize spatial relations based on distance metric learning. Our network transforms spatial relations to a feature space that captures their similarities based on 3D point clouds of the objects and without prior semantic knowledge of the relations. It employs gradient-based optimization to compute object poses in order to imitate an arbitrary target relation by reducing the distance to it under the learned metric. We study a random graph model named the "block model" in statistics and the "planted partition model" in theoretical computer science. In its simplest form, this is a random graph with two equal-sized clusters, with a between-class edge probability of $q$ and a within-class edge probability of $p$. A striking conjecture of Decelle, Krzakala, Moore and Zdeborov\'a, based on deep, non-rigorous ideas from statistical physics, gave a precise prediction for the algorithmic threshold of clustering in the sparse planted partition model. In particular, if $p = a/n$ and $q = b/n$, and we set $s=(a-b)/2$ and $d=(a+b)/2$, then Decelle et al.\ conjectured that it is possible to efficiently cluster in a way correlated with the true partition if $s^2 > d$ and impossible if $s^2 < d$. By comparison, the best-known rigorous result is that of Coja-Oghlan, who showed that clustering is possible if $s^2 > C d \ln d$ for some sufficiently large $C$. In a previous work, we proved that indeed it is information theoretically impossible to cluster if $s^2 < d$ and furthermore it is information theoretically impossible to even estimate the model parameters from the graph when $s^2 < d$. Here we complete the proof of the conjecture by providing an efficient algorithm for clustering in a way that is correlated with the true partition when $s^2 > d$. A different independent proof of the same result was recently obtained by Laurent Massoulie. We review the use of kinetically constrained models (KCMs) for the study of dynamics in glassy systems. The characteristic feature of KCMs is that they have trivial, often non-interacting, equilibrium behaviour but interesting slow dynamics due to restrictions on the allowed transitions between configurations. The basic question which KCMs ask is therefore how much glassy physics can be understood without an underlying ``equilibrium glass transition''. After a brief review of glassy phenomenology, we describe the main model classes, which include spin-facilitated (Ising) models, constrained lattice gases, models inspired by cellular structures such as soap froths, models obtained via mappings from interacting systems without constraints, and finally related models such as urn, oscillator, tiling and needle models. We then describe the broad range of techniques that have been applied to KCMs, including exact solutions, adiabatic approximations, projection and mode-coupling techniques, diagrammatic approaches and mappings to quantum systems or effective models.
Finally, we give a survey of the known results for the dynamics of KCMs both in and out of equilibrium, including topics such as relaxation time divergences and dynamical transitions, nonlinear relaxation, aging and effective temperatures, cooperativity and dynamical heterogeneities, and finally non-equilibrium stationary states generated by external driving. We conclude with a discussion of open questions and possibilities for future work. Discovering the `Neural Code' from multi-neuronal spike trains is an important task in neuroscience. For such an analysis, it is important to unearth interesting regularities in the spiking patterns. In this report, we present an efficient method for automatically discovering synchrony, synfire chains, and more general sequences of neuronal firings. We use the Frequent Episode Discovery framework of Laxman, Sastry, and Unnikrishnan (2005), in which the episodes are represented and recognized using finite-state automata. Many aspects of functional connectivity between neuronal populations can be inferred from the episodes. We demonstrate these using simulated multi-neuronal data from a Poisson model. We also present a method to assess the statistical significance of the discovered episodes. Since the Temporal Data Mining (TDM) methods used in this report can analyze data from hundreds and potentially thousands of neurons, we argue that this framework is appropriate for discovering the `Neural Code'. Random fields disorder Ising ferromagnets by aligning single spins in the direction of the random field in three space dimensions, or by flipping large ferromagnetic domains at dimensions two and below. While the former requires random fields of typical magnitude similar to the interaction strength, the latter Imry-Ma mechanism only requires infinitesimal random fields. Recently, it has been shown that for dilute anisotropic dipolar systems a third mechanism exists, where the ferromagnetic phase is disordered by finite-size glassy domains at a random field of finite magnitude that is considerably smaller than the typical interaction strength. Using large-scale Monte Carlo simulations and zero-temperature numerical approaches, we show that this mechanism applies to disordered ferromagnets with competing short-range ferromagnetic and antiferromagnetic interactions, suggesting its generality in ferromagnetic systems with competing interactions and an underlying spin-glass phase. A finite-size-scaling analysis of the magnetization distribution suggests that the transition might be first order. The purpose of our work is to obtain a basic understanding and comparison of the performance and structure of real Knowledge Networks, to identify strengths and weaknesses and to highlight guidelines for improvements. We selected 18 Knowledge Networks from the service sector and 12 networks from the production sector and estimated their Performance and Structure in terms of 19 indices from graph theory.
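For readers unfamiliar with such indices, a few representative ones can be computed as in the following illustrative sketch (assuming the networkx library and a toy directed knowledge-transfer graph; the edges are placeholders, not data from the study).

import networkx as nx

# Toy directed knowledge-transfer graph; edges are placeholders, not study data.
g = nx.DiGraph([(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)])

indices = {
    "density": nx.density(g),
    "clustering": nx.average_clustering(g.to_undirected()),
    "assortativity": nx.degree_assortativity_coefficient(g),
}
if nx.is_strongly_connected(g):   # mean path length requires strong connectivity
    indices["mean_path_length"] = nx.average_shortest_path_length(g)
print(indices)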
Highlights from our work include: 1) As most networks are unilaterally structured, the direction of knowledge transfer should be taken into account, as illustrated in the analysis of clubs and entropy, 2) The stability of most Knowledge Networks is questionable, 3) Few networks are effective in sharing information, while most Knowledge Networks cannot benefit from the network effect, have rather limited capability for coordination, information propagation and synchronization, and are not able to integrate Tacit knowledge, 4) Few networks have large cliques, which have to be managed with caution as their role may be highly constructive or destructive, 5) While agents with rich connections form clubs, as in most social networks, the poor club effect is not negligible when we take into account the link direction, 6) The directed link analysis of entropy reveals the low complexity-diversification of the Knowledge Networks. In fact, the only high-entropy network found had been improved by Knowledge Management Professionals. As most Knowledge Networks underperform, there is plenty of room for further customized analysis in order to improve communication efficiency, coordination, Tacit knowledge dissemination and robustness. This is the first comparative study of real Knowledge Networks in terms of graph-theoretic methods. This paper presents the first substantial study of the chemistry of the envelopes around a sample of 18 low-mass pre- and protostellar objects for which physical properties have previously been derived from radiative transfer modeling of their dust continuum emission. Single-dish line observations of 24 transitions of 9 molecular species (not counting isotopes), including HCO+, N2H+, CS, SO, SO2, HCN, HNC, HC3N and CN, are reported. The line intensities are used to constrain the molecular abundances by comparison to Monte Carlo radiative transfer modeling of the line strengths. An empirical chemical network is constructed on the basis of correlations between the abundances of various species. For example, it is seen that the HCO+ and CO abundances are linearly correlated, both increasing with decreasing envelope mass. Species such as CS, SO and HCN show no trend with envelope mass. In particular no trend is seen between the ``evolutionary stage'' of the objects and the abundances of the main sulfur- or nitrogen-containing species. Among the nitrogen-bearing species, the abundances of CN, HNC and HC3N are found to be closely correlated, which can be understood from considerations of the chemical network. The CS/SO abundance ratio is found to correlate with the abundances of CN and HC3N, which may reflect a dependence on the atomic carbon abundance. An anti-correlation is found between the deuteration of HCO+ and HCN, reflecting different temperature dependences for gas-phase deuteration mechanisms. The abundances are compared to other protostellar environments. In particular it is found that the abundances in the cold outer envelope of the previously studied class 0 protostar IRAS16293-2422 are in good agreement with the average abundances for the presented sample of class 0 objects. In some infectious processes, transmission occurs along specific ties between individuals, and these ties constitute a contact network. To estimate the effect of an exposure on infectious outcomes within a collection of contact networks, the analysis must adjust for the correlation of outcomes within networks as well as the probability of exposure.
This estimation process may be more statistically efficient when leveraging baseline covariates related to both the exposure and the infectious outcome. We investigate the extent to which gains in statistical efficiency depend on contact network structure and properties of the infectious process. To do this, we simulate a stochastic compartmental infection on a collection of contact networks, and employ the observational augmented GEE using a variety of contact network and baseline infection summaries as adjustment covariates. We apply this approach to estimate the effect of leadership and a concurrent self-help program on the spread of a novel microfinance program in a collection of villages in Karnataka, India. The prediction of periodical time-series remains challenging due to various types of data distortions and misalignments. Here, we propose a novel model called the Temporal embedding-enhanced convolutional neural Network (TeNet) to learn repeatedly-occurring-yet-hidden structural elements in periodical time-series, called abstract snippets, for predicting future changes. Our model uses convolutional neural networks and embeds a time-series with its potential neighbors in the temporal domain for aligning it to the dominant patterns in the dataset. The model is robust to distortions and misalignments in the temporal domain and demonstrates strong prediction power for periodical time-series. We conduct extensive experiments and discover that the proposed model shows significant and consistent advantages over existing methods on a variety of data modalities ranging from human mobility to household power consumption records. Empirical results indicate that the model is robust to various factors such as the number of samples, the variance of the data, and the numerical ranges of the data. The experiments also verify that the intuition behind the model can be generalized to multiple data types and applications and promises significant improvement in prediction performance across the datasets studied. Human pose estimation (i.e., locating the body parts / joints of a person) is a fundamental problem in human-computer interaction and multimedia applications. Significant progress has been made based on the development of depth sensors, i.e., accessible human pose prediction from still depth images [32]. However, most of the existing approaches to this problem involve several components/models that are independently designed and optimized, leading to suboptimal performances. In this paper, we propose a novel inference-embedded multi-task learning framework for predicting human pose from still depth images, which is implemented with a deep architecture of neural networks. Specifically, we handle two cascaded tasks: i) generating the heat (confidence) maps of body parts via a fully convolutional network (FCN); ii) seeking the optimal configuration of body parts based on the detected body part proposals via an inference built-in MatchNet [10], which measures the appearance and geometric kinematic compatibility of body parts and embodies the dynamic programming inference as an extra network layer. These two tasks are jointly optimized. Our extensive experiments show that the proposed deep model significantly improves the accuracy of human pose estimation over several other state-of-the-art methods or SDKs. We also release a large-scale dataset for comparison, which includes 100K depth images under challenging scenarios.
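To illustrate the kind of dynamic-programming inference such a framework embeds, here is a Viterbi-style sketch over body-part proposals on a simple kinematic chain; the paper's version uses a learned MatchNet compatibility, whereas the unary and pairwise scores below are random placeholders showing only the recursion.

import numpy as np

rng = np.random.default_rng(1)
n_parts, n_proposals = 5, 8
unary = rng.random((n_parts, n_proposals))             # heat-map score per proposal
pairwise = rng.random((n_parts - 1, n_proposals, n_proposals))  # link compatibility

best = unary[0].copy()
backptr = np.zeros((n_parts - 1, n_proposals), dtype=int)
for p in range(1, n_parts):
    scores = best[:, None] + pairwise[p - 1]   # predecessor score + link score
    backptr[p - 1] = scores.argmax(axis=0)
    best = unary[p] + scores.max(axis=0)

config = [int(best.argmax())]                  # backtrack the best configuration
for p in range(n_parts - 2, -1, -1):
    config.append(int(backptr[p, config[-1]]))
config.reverse()
print("selected proposal per part:", config)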
In this work we deal with the optimal design and optimal control of structures undergoing large rotations. In other words, we show how to find the corresponding initial configuration and the corresponding set of multiple load parameters in order to recover a desired deformed configuration, or some desirable features of the deformed configuration as specified more precisely by the objective or cost function. The model problem chosen to illustrate the proposed optimal design and optimal control methodologies is that of a geometrically exact beam. First, we present a non-standard formulation of the optimal design and optimal control problems, relying on the method of Lagrange multipliers in order to make the mechanics state variables independent from either design or control variables and thus provide the most general basis for developing the best possible solution procedure. Two different solution procedures are then explored, one based on the diffuse approximation of the response function and a gradient method, and the other based on a genetic algorithm. A number of numerical examples are given in order to illustrate both the advantages and potential drawbacks of each of the presented procedures. We obtain an analytic expression for the full distribution of conductance for a strongly disordered three-dimensional conductor within a perturbative approach based on the transfer-matrix formulation. Our results confirm numerical evidence that the log-normal limit of the distribution is not reached even in the deeply insulating regime. We show that the variance of the logarithm of the conductance scales as a fractional power of the mean, while the skewness changes sign as one approaches the Anderson metal-insulator transition from the deeply insulating limit, all described as a function of a single parameter. The approach suggests a possible single-parameter description of the Anderson transition that takes into account the full nontrivial distribution of conductance. Detecting small objects is notoriously challenging due to their low resolution and noisy representation. Existing object detection pipelines usually detect small objects through learning representations of all the objects at multiple scales. However, the performance gain of such ad hoc architectures is usually too limited to pay off the computational cost. In this work, we address the small object detection problem by developing a single architecture that internally lifts representations of small objects to "super-resolved" ones, achieving similar characteristics as large objects and thus making them more discriminative for detection. For this purpose, we propose a new Perceptual Generative Adversarial Network (Perceptual GAN) model that improves small object detection by narrowing the representation difference between small objects and large ones. Specifically, its generator learns to transfer perceived poor representations of the small objects to super-resolved ones that are similar enough to real large objects to fool a competing discriminator. Meanwhile, its discriminator competes with the generator to identify the generated representation and imposes an additional perceptual requirement -- generated representations of small objects must be beneficial for detection purposes -- on the generator. Extensive evaluations on the challenging Tsinghua-Tencent 100K and the Caltech benchmark demonstrate the superiority of Perceptual GAN in detecting small objects, including traffic signs and pedestrians, over well-established state-of-the-art methods.
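The two-branch objective just described can be sketched as follows, assuming PyTorch; the branch modules, the label shapes, and the equal loss weighting are our assumptions, and this is a schematic of the idea rather than the paper's implementation.

import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def discriminator_loss(adv_branch, det_branch, feats_large, feats_lifted, det_targets):
    # Adversarial branch: separate real large-object features from lifted ones.
    real = bce(adv_branch(feats_large), torch.ones(feats_large.size(0), 1))
    fake = bce(adv_branch(feats_lifted.detach()), torch.zeros(feats_lifted.size(0), 1))
    # Perceptual branch: lifted features must still support detection.
    det = nn.functional.cross_entropy(det_branch(feats_lifted.detach()), det_targets)
    return real + fake + det

def generator_loss(adv_branch, det_branch, feats_lifted, det_targets):
    # Fool the adversarial branch while remaining useful for detection.
    fool = bce(adv_branch(feats_lifted), torch.ones(feats_lifted.size(0), 1))
    perceptual = nn.functional.cross_entropy(det_branch(feats_lifted), det_targets)
    return fool + perceptual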
A complex network approach is proposed to study the shear behavior of a rough rock joint. Similarities between aperture profiles are established and a general network in two directions (parallel and perpendicular to the shear direction) is constructed. Evaluation of this newly formed network shows that the degree distribution of the network, after a transition stage, falls into a quasi-stable state which roughly obeys a Gaussian distribution. In addition, the growth of the clustering coefficient and the number of edges scale approximately with the development of shear strength and hydraulic conductivity, which can be utilized to estimate the shear distribution over asperities. Furthermore, we characterize the contact profiles using the same approach. In contrast to the former case, the latter networks follow a growing-network mode. The multi-Vehicle Routing Problem with Pickup and Delivery with Time Windows is a challenging version of the Vehicle Routing Problem. In this paper, by embedding many complex assignment routing constraints through constructing a multi-dimensional network, we intend to reach optimality for local clusters derived from a reasonably large set of passengers on real-world transportation networks. More specifically, we introduce a multi-vehicle state-space-time network representation in which only the non-dominated assignment-based hyper-paths are examined. In addition, with the aid of the passengers' cumulative service patterns defined in this paper, our solution approach is able to control the symmetry issue, a common issue in combinatorial problems. Finally, extensive computational results over the instances proposed by Ropke and Cordeau (2009) and randomly generated data sets from the Phoenix subarea, City of Tempe, show the computational efficiency and solution optimality of our developed algorithm. Modeling interpersonal influence on different sentimental polarities is a fundamental problem in opinion formation and viral marketing. An effective solution for learning sentimental influences from users' behaviors has not yet been seen. Previous related works on information propagation directly define the interpersonal influence between each pair of users as a parameter, independent of all others, even if the influences come from or affect the same user. Influences are learned from users' propagation behaviors, namely temporal cascades, while sentiments are not associated with them. Thus we propose to model the interpersonal influence by latent influence and susceptibility matrices defined on individual users and sentiment polarities. Such low-dimensional and distributed representations naturally make the interpersonal influences related to the same user coupled with each other, and in turn, reduce the model complexity. Sentiments act on different rows of the parameter matrices, depicting their effects in modeling cascades. With an iterative optimization algorithm of projected stochastic gradient descent over shuffled mini-batches and the Adadelta update rule, negative cases are repeatedly sampled according to the distribution of users' infection frequencies, reducing computation cost and optimization imbalance. Experiments are conducted on a Microblog dataset. The results show that our model achieves better performance than the state-of-the-art and pair-wise models. Besides, analyzing the distribution of learned users' sentimental influences and susceptibilities yields some interesting discoveries.
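A minimal sketch of this low-rank parameterization (our illustration; the dimensions and the logistic link are assumptions): each user carries one influence vector and one susceptibility vector per sentiment polarity, and a pairwise activation probability is scored through their inner product rather than through a free per-pair parameter.

import numpy as np

rng = np.random.default_rng(2)
n_users, dim, n_sentiments = 100, 16, 3
influence = rng.normal(scale=0.1, size=(n_sentiments, n_users, dim))
susceptibility = rng.normal(scale=0.1, size=(n_sentiments, n_users, dim))

def activation_prob(u, v, s):
    """Probability that user u activates user v under sentiment polarity s."""
    score = influence[s, u] @ susceptibility[s, v]
    return 1.0 / (1.0 + np.exp(-score))

print(activation_prob(0, 1, 2))

Because the vectors are shared across all pairs involving the same user, influences related to that user are automatically coupled, which is the stated motivation for the design.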
Associative memories are data structures that allow retrieval of stored messages from part of their content. They thus behave similarly to the human brain, which is capable, for instance, of retrieving the end of a song given its beginning. Among different families of associative memories, sparse ones are known to provide the best efficiency (ratio of the number of bits stored to that of bits used). Nevertheless, it is well known that non-uniformity of the stored messages can lead to a dramatic decrease in performance. We introduce several strategies to allow efficient storage of non-uniform messages in recently introduced sparse associative memories. We analyse and discuss the methods introduced. We also present a practical application example. In this work we explore a straightforward variational Bayes scheme for Recurrent Neural Networks. Firstly, we show that a simple adaptation of truncated backpropagation through time can yield good quality uncertainty estimates and superior regularisation at only a small extra computational cost during training. Secondly, we demonstrate how a novel kind of posterior approximation yields further improvements to the performance of Bayesian RNNs. We incorporate local gradient information into the approximate posterior to sharpen it around the current batch statistics. This technique is not exclusive to recurrent neural networks and can be applied more widely to train Bayesian neural networks. We also empirically demonstrate how Bayesian RNNs are superior to traditional RNNs on a language modelling benchmark and an image captioning task, as well as showing how each of these methods improves our model over a variety of other schemes for training them. We also introduce a new benchmark for studying uncertainty for language models so future methods can be easily compared. We analyze a distributed information network in which each node has access to the information contained in a limited set of nodes (its neighborhood) at a given time. A collective computation is carried out in which each node calculates a value that reflects all the information contained in the network (in our case, the average value of a variable that can take different values in each network node). The neighborhoods can change dynamically by exchanging neighbors with other nodes. The results of this collective calculation show rapid convergence and good scalability with the network size. These results are compared with those of a fixed network arranged as a square lattice, in which the number of rounds to achieve a given accuracy grows very quickly as the size of the network increases. The results for the evolving networks are interpreted in light of the properties of complex networks and are directly related to the diameter and characteristic path length of the networks, which seem to express "small world" properties. While gradient descent has proven highly successful in learning connection weights for neural networks, the actual structure of these networks is usually determined by hand, or by other optimization algorithms. Here we describe a simple method to make network structure differentiable, and therefore accessible to gradient descent. We test this method on recurrent neural networks applied to simple sequence prediction problems. Starting with initial networks containing only one node, the method automatically builds networks that successfully solve the tasks. The number of nodes in the final network correlates with task difficulty.
The method can dynamically increase network size in response to an abrupt complexification in the task; however, a reduction in network size in response to task simplification is not evident for reasonable meta-parameters. The method does not penalize network performance for these test tasks: variable-size networks actually reach better performance than fixed-size networks of higher, lower or identical size. We conclude by discussing how this method could be applied to more complex networks, such as feedforward layered networks, or multiple-area networks of arbitrary shape. The understanding of molecular cell biology requires insight into the structure and dynamics of networks that are made up of thousands of interacting molecules of DNA, RNA, proteins, metabolites, and other components. One of the central goals of systems biology is the unraveling of the as yet poorly characterized complex web of interactions among these components. This work is made harder by the fact that new species and interactions are continuously discovered in experimental work, necessitating the development of adaptive and fast algorithms for network construction and updating. Thus, the "reverse-engineering" of networks from data has emerged as one of the central concerns of systems biology research. A variety of reverse-engineering methods have been developed, based on tools from statistics, machine learning, and other mathematical domains. In order to effectively use these methods, it is essential to develop an understanding of the fundamental characteristics of these algorithms. With that in mind, this chapter is dedicated to the reverse-engineering of biological systems. Specifically, we focus our attention on a particular class of methods for reverse-engineering, namely those that rely algorithmically upon the so-called "hitting-set" problem, which is a classical combinatorial and computer science problem. Each of these methods utilizes a different algorithm in order to obtain an exact or an approximate solution of the hitting-set problem. We will explore the ultimate impact that the alternative algorithms have on the inference of published in silico biological networks. Entrainment by a pacemaker, representing an element with a higher frequency, is numerically investigated for several classes of random networks which consist of identical phase oscillators. We find that the entrainment frequency window of a network decreases exponentially with its depth, defined as the mean forward distance of the elements from the pacemaker. Effectively, only shallow networks can thus exhibit frequency-locking to the pacemaker. The exponential dependence is also derived analytically as an approximation for large random asymmetric networks. To understand the sample-to-sample fluctuations in disorder-generated multifractal patterns we investigate analytically as well as numerically the statistics of high values of the simplest model - the ideal periodic $1/f$ Gaussian noise. By employing the thermodynamic formalism we predict the characteristic scale and the precise scaling form of the distribution of the number of points above a given level. We demonstrate that the power-law forward tail of the probability density, with an exponent controlled by the level, results in an important difference between the mean and the typical values of the counting function. This can be further used to determine the typical threshold $x_m$ of extreme values in the pattern, which turns out to be given by $x_m^{(typ)}=2-c\ln{\ln{M}}/\ln{M}$ with $c=3/2$.
This observation provides a rather compelling explanation of the mechanism behind the universality of $c$. The revealed mechanisms are conjectured to retain their qualitative validity for a broad class of disorder-generated multifractal fields. In particular, we predict that the typical value of the maximum $p_{max}$ of the intensity is given by $-\ln{p_{max}} = \alpha_{-}\ln{M} + \frac{3}{2f'(\alpha_{-})}\ln{\ln{M}} + O(1)$, where $f(\alpha)$ is the corresponding singularity spectrum vanishing at $\alpha=\alpha_{-}>0$. For the $1/f$ noise we also derive exact as well as well-controlled approximate formulas for the mean and the variance of the counting function without recourse to the thermodynamic formalism. We present an analysis of the impact of structural disorder on the static scattering function of $f$-armed star branched polymers in $d$ dimensions. To this end, we consider the model of a star polymer immersed in a good solvent in the presence of structural defects, correlated at large distances $r$ according to a power law $\sim r^{-a}$. In particular, we are interested in the ratio $g(f)$ of the radii of gyration of star and linear polymers of the same molecular weight, which is a universal, experimentally measurable quantity. We apply a direct polymer renormalization approach and evaluate the results within the double expansion in $\varepsilon=4-d$ and $\delta=4-a$. We find an increase of $g(f)$ with increasing $\delta$. Therefore, an increase of disorder correlations leads to an increase of the size measure of a star relative to linear polymers of the same molecular weight. How to determine the community structure of complex networks is an open question. It is critical to establish the best strategies for community detection in networks of unknown structure. Here, using standard synthetic benchmarks, we show that none of the algorithms hitherto developed for community structure characterization performs optimally. Significantly, evaluating the results according to their modularity, the most popular measure of the quality of a partition, systematically provides mistaken solutions. However, a novel quality function, called Surprise, can be used to elucidate which is the optimal division into communities. Consequently, we show that the best strategy to find the community structure of all the networks examined involves choosing among the solutions provided by multiple algorithms the one with the highest Surprise value. We conclude that Surprise maximization precisely reveals the community structure of complex networks. We propose sensorimotor tappings, a new graphical technique that explicitly represents relations between the time steps of an agent's sensorimotor loop and a single training step of an adaptive model that the agent is using internally. In the simplest case this is a relation linking two time steps. In realistic cases these relations can extend over several time steps and over different sensory channels. The aim is to capture the footprint of information intake relative to the agent's current time step. We argue that this view allows us to make prior considerations explicit and then use them in implementations without modification once they are established. In the paper we introduce the problem domain, explain the basic idea, provide example tappings for standard configurations used in developmental models, and show how tappings can be applied to problems in related fields.
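As a toy illustration of the idea (our encoding, not the paper's notation), a tapping can be represented as a set of (channel, lag) pairs that select which logged sensorimotor values feed one training step of the internal model.

import numpy as np

rng = np.random.default_rng(3)
log = {"motor": rng.random(100), "sensor": rng.random(100)}   # toy recorded loop
tapping = [("motor", -1), ("sensor", -1), ("sensor", -2)]     # (channel, lag) pairs

def training_pair(t):
    """Inputs gathered through the tapping; target = current sensory value."""
    x = np.array([log[ch][t + lag] for ch, lag in tapping])
    y = log["sensor"][t]
    return x, y

x, y = training_pair(10)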
We formulate two versions of the power control problem for wireless networks with latency constraints arising from duty cycle allocations. In the first version, strategic power optimization, wireless nodes are modeled as rational agents in a power game, who strategically adjust their powers to minimize their own energy. In the other version, joint power optimization, wireless nodes jointly minimize the aggregate energy expenditure. Our analysis of these models yields insights into the different energy outcomes of strategic versus joint power optimization. We derive analytical solutions for power allocation under both models and study how they are affected by data loads and channel quality. We derive simple necessary conditions for the existence of Nash equilibria in the power game and also provide numerical examples of optimal power allocation under both models. Finally, we show that joint optimization can (sometimes) be Pareto-optimal and dominate strategic optimization, i.e., the energy expenditure of all nodes is lower than if they were using strategic optimization. We describe DyNet, a toolkit for implementing neural network models based on dynamic declaration of network structure. In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its derivatives. In DyNet's dynamic declaration strategy, computation graph construction is mostly transparent, being implicitly constructed by executing procedural code that computes the network outputs, and the user is free to use different network structures for each input. Dynamic declaration thus facilitates the implementation of more complicated network architectures, and DyNet is specifically designed to allow users to implement their models in a way that is idiomatic in their preferred programming language (C++ or Python). One challenge with dynamic declaration is that because the symbolic computation graph is defined anew for every training example, its construction must have low overhead. To achieve this, DyNet has an optimized C++ backend and lightweight graph representation. Experiments show that DyNet's speeds are faster than or comparable with those of static declaration toolkits, and significantly faster than Chainer, another dynamic declaration toolkit. DyNet is released open-source under the Apache 2.0 license and available at http://github.com/clab/dynet. We propose an entirely data-driven approach to estimating the 3D pose of a hand given a depth image. We show that we can correct the mistakes made by a Convolutional Neural Network trained to predict an estimate of the 3D pose by using a feedback loop. The components of this feedback loop are also Deep Networks, optimized using training data. They remove the need for fitting a 3D model to the input data, which requires both a carefully designed fitting function and algorithm. We show that our approach outperforms state-of-the-art methods, and is efficient as our implementation runs at over 400 fps on a single GPU. We study the local density of states around potential scatterers in d-wave superconductors, and show that quantum interference between impurity states is not negligible for experimentally relevant impurity concentrations.
The two-impurity model is used as a paradigm to understand these effects analytically and in interpreting numerical solutions of the Bogoliubov-de Gennes equations on fully disordered systems. We focus primarily on the globally particle-hole symmetric model which has been the subject of considerable controversy, and give evidence that a zero-energy delta function exists in the DOS. The anomalous spectral weight at zero energy is seen to arise from resonant impurity states belonging to a particular sublattice, exactly as in the 2-impurity version of this model. We discuss the implications of these findings for realistic models of the cuprates. Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of "one-shot learning." Traditional gradient-based networks require a lot of data to learn, often through extensive iterative training. When new data is encountered, the models must inefficiently relearn their parameters to adequately incorporate the new information without catastrophic interference. Architectures with augmented memory capacities, such as Neural Turing Machines (NTMs), offer the ability to quickly encode and retrieve new information, and hence can potentially obviate the downsides of conventional models. Here, we demonstrate the ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples. We also introduce a new method for accessing an external memory that focuses on memory content, unlike previous methods that additionally use memory location-based focusing mechanisms. Using our previous results for the configurational entropy of a stripe glass, as well as a variational result for the bare surface tension of entropic droplets, we show that there is no disagreement between the numerical simulations of Grousson et al. and our theory. The claim that our theory disagrees with numerical simulations is based on the assumption that the surface tension is independent of the frustration parameter Q of the model. However, we show in this Reply that it varies strongly with Q and that the resulting Q-dependence of the kinetic fragility agrees with the one obtained by Grousson et al. We believe that this answers the questions raised in the Comment by Grousson et al. This paper deals with the use of self-organizing protocols to improve the reliability of dynamic Peer-to-Peer (P2P) overlay networks. We present two approaches that employ local knowledge of the 2nd neighborhood of nodes. The first scheme is a simple protocol requiring interactions among nodes and their direct neighbors. The second scheme extends this approach by resorting to the Edge Clustering Coefficient (ECC), a local measure that allows the identification of those edges that connect different clusters in an overlay. A simulation assessment is presented, which evaluates these protocols over uniform networks, clustered networks and scale-free networks. Different failure modes are considered. Results demonstrate the viability of the proposal. The one-dimensional Ising spin-glass model with power-law long-range interactions is a useful proxy model for studying spin glasses in higher space dimensions and for finding the dimension at which the spin-glass state changes from having broken replica symmetry to that of droplet behavior. To this end we have calculated the exponent that describes the difference in free energy between periodic and antiperiodic boundary conditions.
Numerical work is done to support some of the assumptions made in the calculations and to determine the behavior of the interface free-energy exponent as a function of the power-law exponent of the interactions. Our numerical results for the interface free-energy exponent are badly affected by finite-size problems. Cognitive Radio (CR) Communications (CC) are mainly deployed within the environments of primary (user) communications, where the channel states and accessibility are usually stochastically distributed (benign or IID). However, many practical CC are also exposed to disturbing events (contaminated) and vulnerable to jamming attacks (adversarial or non-IID). Thus, the channel state distribution of the spectrum could be stochastic, contaminated or adversarial at different temporal and spatial locations. Without any a priori knowledge, facilitating optimal CC is a very challenging issue. In this paper, we propose an online learning algorithm that performs joint channel sensing, probing and adaptive channel access for multi-channel CC in general unknown environments. We pay special attention to energy-efficient CC (EECC), which is highly desirable for green wireless communications and which must combat potential jamming attacks that could greatly mar the energy and spectrum efficiency of CC. The EECC is formulated as a constrained regret minimization problem with power budget constraints. By tuning a novel exploration parameter, our algorithms can adaptively find the optimal channel access strategies and achieve almost optimal learning performance for EECC in different scenarios, with vanishing long-term power budget violations. We also consider the important scenario of cooperative learning and information sharing among multiple CR users, which can yield further performance improvements. The proposed algorithms are resilient to both oblivious and adaptive jamming attacks with different intelligence and attacking strength. Extensive numerical experiments are conducted to validate our theory. We argue that the critical dynamical fluctuations predicted by the mode-coupling theory (MCT) of glasses provide a natural mechanism to explain the breakdown of the Stokes-Einstein relation. This breakdown, observed numerically and experimentally in a region where MCT should hold, is one of the major difficulties of the theory, for which we propose a natural resolution based on the recent interpretation of the MCT transition as a bona fide critical point with a diverging length scale. We also show that the upper critical dimension of MCT is d_c=8. We propose to model the dynamics of metabolic networks from a systems biology point of view by four dynamical structure elements: a potential function, a transverse matrix, a degradation matrix, and a stochastic force. These four elements are balanced to determine the network dynamics, which gives rise to a special stochastic differential equation supplemented by a relationship between the stochastic force and the degradation matrix. Important network behaviors can be obtained from the potential function without explicitly computing the time-dependent solution. The existence of such a potential function suggests a global optimization principle, and the existence of the stochastic force corresponds naturally to the hierarchical structure in metabolic networks.
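In schematic form (our illustrative notation, not taken verbatim from the text), such a four-element decomposition can be written as

$$[S(x) + T(x)]\,\dot{x} = -\nabla\phi(x) + \xi(t), \qquad \langle \xi(t)\,\xi^{\top}(t')\rangle = 2\,\epsilon\,S(x)\,\delta(t-t'),$$

where $S$ is the symmetric degradation matrix, $T$ the antisymmetric transverse matrix, $\phi$ the potential function, $\xi$ the stochastic force, and $\epsilon$ a noise strength; the second relation is the stated link between the stochastic force and the degradation matrix.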
We provide theoretical evidence to justify our proposal by discussing its connections to other large-scale biochemical systems approaches, such as network thermodynamics theory, biochemical systems theory, metabolic control analysis, and flux balance analysis. Experimental data displaying stochasticity are also pointed out. We study a highly supercooled two-dimensional fluid mixture via molecular dynamics simulation. We follow bond breakage events among particle pairs, which occur on the scale of the $\alpha$ relaxation time $\tau_{\alpha}$. Large scale heterogeneities analogous to the critical fluctuations in Ising systems are found in the spatial distribution of bonds which are broken in a time interval with a width of order $0.05\tau_{\alpha}$. The structure factor of the broken bond density is well approximated by the Ornstein-Zernike form. The correlation length is of order $100 \sigma_1$ at the lowest temperature studied, $\sigma_1$ being the particle size. The weakly bonded regions thus identified evolve in time with strong spatial correlations. The ICDM Challenge 2013 is to apply machine learning to the problem of hotel ranking, aiming to maximize purchases according to given hotel characteristics, location attractiveness of hotels, users' aggregated purchase history and competitive online travel agency information for each potential hotel choice. This paper describes the solution of team "binghsu & MLRush & BrickMover". We conduct simple feature engineering work, and each individual team member trains different models. Afterwards, we use a listwise ensemble method to combine each model's output. Besides describing effective models and features, we discuss the lessons we learned while using deep learning in this competition. To study the effect of quenched disorder in a class of reaction-diffusion systems, we introduce a conserved-mass model of diffusion and aggregation in which the mass moves as a whole to a nearest neighbour on most sites, while it fragments off as a single monomer (i.e. chips off) from certain fixed sites. Once the mass leaves any site, it coalesces with the mass present on its neighbour. We study in detail the effect of a \emph{single} chipping site on the steady state in arbitrary dimensions, with and without bias. In the thermodynamic limit, the system can exist in one of the following phases -- (a) the Pinned Aggregate (PA) phase, in which an infinite aggregate (with mass proportional to the volume of the system) appears with probability one at the chipping site but not in the bulk; (b) the Unpinned Aggregate (UA) phase, in which \emph{both} the chipping site and the bulk can support an infinite aggregate simultaneously; and (c) the Non Aggregate (NA) phase, in which there is no infinite cluster. Our analytical and numerical studies show that the system exists in the UA phase in all cases except in 1d with bias. In the latter case, there is a phase transition from the NA phase to the PA phase as the density is increased. A variant of the above aggregation model is also considered in which the total particle number is conserved and chipping occurs at a fixed site, but the particles do not interact with each other at other sites. This model is solved exactly by mapping it to a Zero Range Process. With increasing density, it exhibits a phase transition from the NA phase to the PA phase in all dimensions, irrespective of bias. Finally, we discuss the likely behaviour of the system in the presence of extensive disorder.
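The dynamics of the single-chipping-site model described above can be simulated with a few lines of Monte Carlo; the following sketch (unbiased 1D case; the lattice size, density, and random-sequential update scheme are illustrative choices, not those of the paper) implements the stated moves directly.

import numpy as np

rng = np.random.default_rng(4)
L, n_particles, steps = 200, 100, 200_000
mass = rng.multinomial(n_particles, [1.0 / L] * L)  # random initial placement
chip_site = 0

for _ in range(steps):
    i = int(rng.integers(L))
    if mass[i] == 0:
        continue
    j = (i + rng.choice([-1, 1])) % L     # unbiased nearest-neighbour target
    if i == chip_site:
        mass[i] -= 1; mass[j] += 1        # chip off a single monomer
    else:
        mass[j] += mass[i]; mass[i] = 0   # move the whole stack and coalesce

print("mass at chipping site:", mass[chip_site], "largest bulk mass:", mass[1:].max())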
In this paper we introduce a method to overcome one of the main challenges of person re-identification in multi-camera networks, namely cross-view appearance changes. The proposed solution addresses the extreme variability of person appearance in different camera views by exploiting multiple feature representations. For each feature, Kernel Canonical Correlation Analysis (KCCA) with different kernels is exploited to learn several projection spaces in which the appearance correlation between samples of the same person observed from different cameras is maximized. An iterative logistic regression is finally used to select and weigh the contributions of each feature projection and to perform the matching between the two views. Experimental evaluation shows that the proposed solution obtains comparable performance on the VIPeR and PRID 450s datasets and improves on the PRID and CUHK01 datasets with respect to the state of the art. A quantum network promises to enable long-distance quantum communication, and to assemble small quantum devices into a large quantum computing cluster. Each network node can thereby be seen as a small, few-qubit quantum computer. Qubits can be sent over direct physical links connecting nearby quantum nodes, or by means of teleportation over pre-established entanglement amongst distant network nodes. Such pre-shared entanglement effectively forms a shortcut - a virtual quantum link - which can be used exactly once. Here, we present an abstraction of a quantum network that allows ideas from computer science to be applied to the problem of routing qubits and managing entanglement in the network. Specifically, we consider a scenario in which each quantum network node can create EPR pairs with its immediate neighbours over a physical connection, and perform entanglement swapping operations in order to create long-distance virtual quantum links. We proceed to discuss the features unique to quantum networks, which call for the development of new routing techniques. As an example, we present two simple hierarchical routing schemes for a quantum network of N nodes for a ring and sphere topology. For these topologies we present efficient routing algorithms requiring O(log N) qubits to be stored at each network node, O(polylog N) time and space to perform routing decisions, and O(log N) timesteps to replenish the virtual quantum links in a model of entanglement generation. We propose a new approach to the problem of neural network expressivity, which seeks to characterize how structural properties of a neural network family affect the functions it is able to compute. Our approach is based on an interrelated set of measures of expressivity, unified by the novel notion of trajectory length, which measures how the output of a network changes as the input sweeps along a one-dimensional path. Our findings can be summarized as follows: (1) The complexity of the computed function grows exponentially with depth. (2) All weights are not equal: trained networks are more sensitive to their lower (initial) layer weights. (3) Regularizing on trajectory length (trajectory regularization) is a simpler alternative to batch normalization, with the same performance. In this work we present a new methodology to study the structure of the configuration spaces of hard combinatorial problems. It consists in building the network that has as nodes the locally optimal configurations and as edges the weighted oriented transitions between their basins of attraction.
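A toy version of this construction on a random binary landscape (our illustration; the paper works with QAP instances and a different basin-transition definition) hill-climbs from many starts, records each local optimum as a node, and counts perturbation-induced transitions between optima as weighted oriented edges.

import numpy as np

rng = np.random.default_rng(5)
n = 10
fitness_table = rng.random(2 ** n)        # random fitness landscape over bitstrings
def fitness(x): return fitness_table[int("".join(map(str, x)), 2)]

def hill_climb(x):
    x, improved = list(x), True
    while improved:
        improved = False
        for i in range(n):                # accept the first improving 1-bit flip
            y = x.copy(); y[i] ^= 1
            if fitness(y) > fitness(x):
                x, improved = y, True
    return tuple(x)

edges = {}
for _ in range(2000):
    o1 = hill_climb(rng.integers(0, 2, n))
    kick = list(o1); kick[int(rng.integers(n))] ^= 1   # perturb one bit, re-descend
    o2 = hill_climb(kick)
    edges[(o1, o2)] = edges.get((o1, o2), 0) + 1       # weighted oriented transition

optima = {u for e in edges for u in e}
print(len(optima), "local optima,", len(edges), "weighted transitions")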
We apply the approach to the detection of communities in the optima networks produced by two different classes of instances of a hard combinatorial optimization problem: the quadratic assignment problem (QAP). We provide evidence indicating that the two problem instance classes give rise to very different configuration spaces. For the so-called real-like class, the networks possess a clear modular structure, while the optima networks belonging to the class of random uniform instances are less well partitionable into clusters. This is convincingly supported by several statistical tests. Finally, we briefly discuss the consequences of the findings for heuristically searching the corresponding problem spaces. Recent results are reviewed on both the time evolution and retrieval properties of multi-state neural networks that are based upon spin-glass models. In particular, the properties of models with neuron states having Q-Ising symmetry are discussed for various architectures. The main common features and differences are highlighted. We propose a novel trust metric for social networks which is suitable for application in recommender systems. It is personalised and dynamic and allows one to compute the indirect trust between two agents which are not neighbours based on the direct trust between agents that are neighbours. In analogy to some personalised versions of PageRank, this metric makes use of the concept of feedback centrality and overcomes some of the limitations of other trust metrics. In particular, it does not neglect cycles and other patterns characterising social networks, as some other algorithms do. In order to apply the metric to recommender systems, we propose a way to make trust dynamic over time. We show by means of analytical approximations and computer simulations that the metric has the desired properties. Finally, we carry out an empirical validation on a dataset crawled from an Internet community and compare the performance of a recommender system using our metric to one using collaborative filtering. We argue that the Scanning Tunneling Microscope (STM) images of resonant states generated by doping Zn or Ni impurities into Cu-O planes of BSCCO are the result of quantum interference of the impurity signal coming from several distinct paths. The impurity image seen on the surface is greatly affected by interlayer tunneling matrix elements. We find that the optimal tunneling path between the STM tip and the metal (Cu, Zn, or Ni) $d_{x^2 - y^2}$ orbitals in the Cu-O plane involves intermediate excited states. This tunneling path leads to the four-fold nonlocal filter of the impurity state in the Cu-O plane that explains the experimental impurity spectra. Applications of the tunneling filter to the Cu vacancy defects and ``direct'' tunneling into Cu-O planes are also discussed. Vision impairment due to pathological damage of the retina can largely be prevented through periodic screening using fundus color imaging. However, the challenge with large-scale screening is the inability to exhaustively detect fine blood vessels crucial to disease diagnosis. In this work we present a computational imaging framework using deep and ensemble learning for reliable detection of blood vessels in fundus color images. An ensemble of deep convolutional neural networks is trained to segment vessel and non-vessel areas of a color fundus image. During inference, the responses of the individual ConvNets of the ensemble are averaged to form the final segmentation.
In an experimental evaluation with the DRIVE database, we achieve the objective of vessel detection with a maximum average accuracy of 94.7\% and an area under the ROC curve of 0.9283. The one-dimensional (1D) tight binding model with random nearest neighbor hopping is known to have a singularity of the density of states and of the localization length at the band center. We study numerically the effects of random long range (power-law) hopping with an ensemble averaged magnitude $\langle |t_{ij}| \rangle \propto |i-j|^{-\sigma}$ in the 1D chain, while maintaining the particle-hole symmetry present in the nearest neighbor model. We find, in agreement with results of position space renormalization group techniques applied to the random XY spin chain with power-law interactions, that there is a change of behavior when the power-law exponent $\sigma$ becomes smaller than 2. The depinning of an elastic line interacting with quenched disorder is studied for long range interactions, applicable to crack propagation or wetting. An ultrametric distance is introduced instead of the Euclidean distance, allowing for a drastic reduction of the numerical complexity of the problem. Based on large scale simulations, two to three orders of magnitude larger than previously considered, we obtain a very precise determination of critical exponents which are shown to be indistinguishable from their Euclidean metric counterparts. Moreover the scaling functions are shown to be unchanged. The choice of an ultrametric distance thus does not affect the universality class of the depinning transition and opens the way to an analytic real space renormalization group approach. Effective communication for marketing is a vital field in business organizations, used to convey the details of their products and services to the market segments and subsequently to build long-lasting customer relationships. This paper focuses on an emerging component of integrated marketing communication, i.e. social media networking, as it is increasingly becoming the trend. In the 21st century, marketing communication platforms show a tendency to shift towards innovative, technology-bound people networking, which is becoming an acceptable domain of interaction. Though the traditional channels like TV, print media etc. are still active and prominent in marketing communication, the presence of the Internet, and more specifically Social Media Networking, has started influencing the way individuals and business enterprises communicate. It has become evident that more individuals and business enterprises are engaging with social media networking sites either to accelerate the sales of their products and services or to provide post-purchase feedback. This shift in scenario motivated this research, which took six months (June 2011 - December 2011) and uses an empirical analysis carried out on the basis of several primary and secondary sources of evidence. The research paper also analyzes the factors that enable social media networking sites to influence consumers and subsequently their purchase decisions. The secondary data presented for this research pertain to the period between 2005 and 2011. The study revealed promising findings: the transition to marketing through SMN gives visible advantages such as bidirectional communication, interactive product presentation, and a firm influence on customers who have a rudimentary interest...
Curriculum Learning emphasizes the order of training instances in a computational learning setup. The core hypothesis is that simpler instances should be learned early as building blocks to learn more complex ones. Despite its usefulness, it is still unknown how exactly the internal representations of models are affected by curriculum learning. In this paper, we study the effect of curriculum learning on Long Short-Term Memory (LSTM) networks, which have shown strong competency in many Natural Language Processing (NLP) problems. Our experiments on a sentiment analysis task and a synthetic task similar to sequence prediction tasks in NLP show that curriculum learning has a positive effect on the LSTM's internal states by biasing the model towards building constructive representations, i.e. the internal representations at previous timesteps are used as building blocks for the final prediction. We also find that smaller models improve significantly when they are trained with curriculum learning. Lastly, we show that curriculum learning helps more when the amount of training data is limited. Deep neural networks have shown effectiveness in many challenging tasks and proved their strong capability in automatically learning good feature representations from raw input. Nonetheless, designing their architectures still requires much human effort. Techniques for automatically designing neural network architectures, such as reinforcement learning based approaches, have recently shown promising results on benchmarks. However, these methods still train each network from scratch while exploring the architecture space, which results in extremely high computational cost. In this paper, we propose a novel reinforcement learning framework for automatic architecture designing, where the action is to grow the network depth or layer width based on the current network architecture with function preserved. As such, the previously validated networks can be reused for further exploration, thus saving a large amount of computational cost. The experiments on image benchmark datasets have demonstrated the efficiency and effectiveness of our proposed solution compared to existing automatic architecture designing methods. The Earth Mover's Distance (EMD) computes the optimal cost of transforming one distribution into another, given a known transport metric between them. In deep learning, the EMD loss allows us to embed information during training about the output space structure, such as hierarchical or semantic relations. This helps in achieving better output smoothness and generalization. However, EMD is computationally expensive. Moreover, solving EMD optimization problems usually requires complex techniques like the lasso. These properties limit the applicability of EMD-based approaches in large scale machine learning. We address in this work the difficulties facing the incorporation of EMD-based losses in deep learning frameworks. Additionally, we provide insight and novel solutions on how to integrate such a loss function in training deep neural networks. Specifically, we make three main contributions: (i) we provide an in-depth analysis of the fastest state-of-the-art EMD algorithm (Sinkhorn Distance) and discuss its limitations in deep learning scenarios; (ii) we derive fast and numerically stable closed-form solutions for the EMD gradient in output spaces with chain- and tree-connectivity; and (iii) we propose a relaxed form of the EMD gradient with equivalent computational complexity but faster convergence rate.
We support our claims with experiments on real datasets. In a restricted data setting on the ImageNet dataset, we train a model to classify 1000 categories using 50K images, and demonstrate that our relaxed EMD loss achieves better Top-1 accuracy than the cross entropy loss. Overall, we show that our relaxed EMD loss criterion is a powerful asset for deep learning in the small data regime. We explore the nature of the faint blue objects in the Hubble Deep Field South. We have derived proper motions for the point sources in the Hubble Deep Field South using a 3 year baseline. Combining our proper motion measurements with spectral energy distribution fitting enabled us to identify 4 quasars and 42 stars, including 3 white dwarf candidates. Two of these white dwarf candidates, HDFS 1444 and 895, are found to display significant proper motion, 21.1 $\pm$ 7.9 mas/yr and 34.9 $\pm$ 8.0 mas/yr, and are consistent with being thick disk or halo white dwarfs located at ~2 kpc. The other faint blue objects analyzed by Mendez & Minniti do not show any significant proper motion and are inconsistent with being halo white dwarfs; they do not contribute to the Galactic dark matter. The observed population of stars and white dwarfs is consistent with standard Galactic models. Biological structure and function depend on complex regulatory interactions between many genes. A wealth of gene expression data is available from high-throughput genome-wide measurement technologies, but effective gene regulatory network inference methods are still needed. Model-based methods founded on quantitative descriptions of gene regulation are among the most promising, but many such methods rely on simple, local models or on ad hoc inference approaches lacking experimental interpretability. We propose an experimental design and develop an associated statistical method for inferring a gene network by learning a standard quantitative, interpretable, predictive, biophysics-based ordinary differential equation model of gene regulation. We fit the model parameters using gene expression measurements from perturbed steady-states of the system, like those following overexpression or knockdown experiments. Although the original model is nonlinear, our design allows us to transform it into a convex optimization problem by restricting attention to steady-states and using the lasso for parameter selection. Here, we describe the model and inference algorithm and apply them to a synthetic six-gene system, demonstrating that the model is detailed and flexible enough to account for activation and repression as well as synergistic and self-regulation, and the algorithm can efficiently and accurately recover the parameters used to generate the data. Control theory concerns the question of whether and how it is possible to drive the behavior of a complex dynamical system. A system is said to be controllable if we can drive it from any initial state to any desired final state in finite time. For many complex networks, precise knowledge of the system parameters is lacking. However, it is possible to draw conclusions about network controllability by inspecting the network structure. The classical theory of structural controllability is based on Lin's structural controllability theorem, which gives necessary and sufficient conditions for concluding whether any network is structurally controllable. Thanks to this fundamental theorem, we may identify a minimum driver vertex set, whose control with independent driving signals is sufficient to make the whole system controllable.
I show that Lin's theorem does not impose any limitations on the structural controllability of quantum networks. By local operations and classical communication, one can modify any quantum network to make it structurally controllable by a single driving signal. Making decisions about the structure of a future military fleet is a challenging task. Several issues need to be considered, such as the existence of multiple competing objectives and the complexity of the operating environment. A particular challenge is posed by the various types of uncertainty that the future might hold. It is uncertain what future events might be encountered; how fleet design decisions will influence and shape the future; and how present and future decision makers will act based on available information, their personal biases regarding the importance of different objectives, and their economic preferences. In order to assist strategic decision-making, an analysis of future fleet options needs to account for conditions in which these different classes of uncertainty are exposed. It is important to understand what assumptions a particular fleet is robust to, what the fleet can readily adapt to, and what conditions present clear risks to the fleet. We call this the analysis of a fleet's strategic positioning. This paper introduces how strategic positioning can be evaluated using computer simulations. Our main aim is to introduce a framework for capturing information that can be useful to a decision maker and for defining the concepts of robustness and adaptiveness in the context of future fleet design. We demonstrate our conceptual framework using simulation studies of an air transportation fleet. We capture uncertainty by employing an explorative scenario-based approach. Each scenario represents a sampling of different future conditions, different model assumptions, and different economic preferences. Proposed changes to a fleet are then analysed based on their influence on the fleet's robustness, adaptiveness, and risk to different scenarios. Recent success in training deep neural networks has prompted active investigation into the features learned on their intermediate layers. Such research is difficult because it requires making sense of non-linear computations performed by millions of parameters, but valuable because it increases our ability to understand current models and create improved versions of them. In this paper we investigate the extent to which neural networks exhibit what we call convergent learning, which is when the representations learned by multiple nets converge to a set of features which are either individually similar between networks or where subsets of features span similar low-dimensional spaces. We propose a specific method of probing representations: training multiple networks and then comparing and contrasting their individual, learned representations at the level of neurons or groups of neurons. We begin research into this question using three techniques to approximately align different neural networks on a feature level: a bipartite matching approach that makes one-to-one assignments between neurons, a sparse prediction approach that finds one-to-many mappings, and a spectral clustering approach that finds many-to-many mappings. This initial investigation reveals a few previously unknown properties of neural networks, and we argue that future research into the question of convergent learning will yield many more.
The insights described here include (1) that some features are learned reliably in multiple networks, yet other features are not consistently learned; (2) that units learn to span low-dimensional subspaces and, while these subspaces are common to multiple networks, the specific basis vectors learned are not; (3) that the representation codes show evidence of being a mix between a local code and slightly, but not fully, distributed codes across multiple units. We evaluate the uncertainty quality in neural networks using anomaly detection. We extract uncertainty measures (e.g. entropy) from the predictions of candidate models, use those measures as features for an anomaly detector, and gauge how well the detector differentiates known from unknown classes. We assign higher uncertainty quality to candidate models that lead to better detectors. We also propose a novel method for sampling a variational approximation of a Bayesian neural network, called One-Sample Bayesian Approximation (OSBA). We experiment on two datasets, MNIST and CIFAR10. We compare the following candidate neural network models: Maximum Likelihood, Bayesian Dropout, OSBA, and --- for MNIST --- the standard variational approximation. We show that Bayesian Dropout and OSBA provide better uncertainty information than Maximum Likelihood, and are essentially equivalent to the standard variational approximation, but much faster. We study the effect of surface scattering on transport properties in many-mode conducting channels (electron waveguides). Assuming a strong roughness of the surface profiles, we show that there are two independent control parameters that determine statistical properties of the scattering. The first parameter is the ratio of the amplitude of the roughness to the transverse width of the waveguide. The second one, which is typically omitted, is determined by the mean value of the derivative of the profile. This parameter may be large, thus leading to specific properties of scattering. Our results may be used in experimental realizations of the surface scattering of electron waves, as well as for other applications (e.g., for optical and microwave waveguides). An active object recognition system has the advantage of being able to act in the environment to capture images that are more suited for training and that lead to better performance at test time. In this paper, we propose a deep convolutional neural network for active object recognition that simultaneously predicts the object label and selects the next action to perform on the object with the aim of improving recognition performance. We treat active object recognition as a reinforcement learning problem and derive the cost function to train the network for joint prediction of the object label and the action. A generative model of object similarities based on the Dirichlet distribution is proposed and embedded in the network for encoding the state of the system. The training is carried out by simultaneously minimizing the label and action prediction errors using gradient descent. We empirically show that the proposed network is able to predict both the object label and the actions on GERMS, a dataset for active object recognition. We compare the test label prediction accuracy of the proposed model with Dirichlet and Naive Bayes state encoding.
The results of experiments suggest that the proposed model equipped with Dirichlet state encoding is superior in performance, and selects images that lead to better training and higher accuracy of label prediction at test time. Several scenarios of interacting neural networks which are trained either in an identical or in a competitive way are solved analytically. In the case of identical training each perceptron receives the output of its neighbour. The symmetry of the stationary state as well as the sensitivity to the training algorithm used are investigated. Two competitive perceptrons trained on mutually exclusive learning aims and a perceptron which is trained on the opposite of its own output are examined analytically. An ensemble of competitive perceptrons is used as a decision-making algorithm in a model of a closed market (El Farol Bar problem or Minority Game); each network is trained on the history of minority decisions. This ensemble of perceptrons relaxes to a stationary state whose performance can be better than random. This paper presents a new state-of-the-art for document image classification and retrieval, using features learned by deep convolutional neural networks (CNNs). In object and scene analysis, deep neural nets are capable of learning a hierarchical chain of abstraction from pixel inputs to concise and descriptive representations. The current work explores this capacity in the realm of document analysis, and confirms that this representation strategy is superior to a variety of popular hand-crafted alternatives. Experiments also show that (i) features extracted from CNNs are robust to compression, (ii) CNNs trained on non-document images transfer well to document analysis tasks, and (iii) enforcing region-specific feature-learning is unnecessary given sufficient training data. This work also makes available a new labelled subset of the IIT-CDIP collection, containing 400,000 document images across 16 categories, useful for training new CNNs for document analysis. The game of Go is more challenging than other board games, due to the difficulty of constructing a position or move evaluation function. In this paper we investigate whether deep convolutional networks can be used to directly represent and learn this knowledge. We train a large 12-layer convolutional neural network by supervised learning from a database of human professional games. The network correctly predicts the expert move in 55% of positions, equalling the accuracy of a 6 dan human player. When the trained convolutional network was used directly to play games of Go, without any search, it beat the traditional search program GnuGo in 97% of games, and matched the performance of a state-of-the-art Monte-Carlo tree search that simulates a million positions per move. This paper describes the application of information granulation theory to the analysis of "lugeon data". In this manner, using a combination of a Self Organizing Map (SOM) and a Neuro-Fuzzy Inference System (NFIS), crisp and fuzzy granules are obtained. Balancing of crisp granules and sub-fuzzy granules, within non-fuzzy information (initial granulation), is carried out in an open-close iteration. Using two criteria, "simplicity of rules" and "suitable adaptive threshold error level", the stability of the algorithm is guaranteed.
In another part of the paper, rough set theory (RST) has been employed for approximate analysis. Validation of the proposed methods on a large data set of in-situ permeability measurements in rock masses at the Shivashan dam, Iran, has been highlighted. Implementation of the proposed algorithm on the lugeon data set showed that the suggested method, relating approximate analysis to permeability, can be applied. Learning and memory are acquired through long-lasting changes in synapses. In the simplest models, such synaptic potentiation typically leads to runaway excitation, but in reality there must exist processes that robustly preserve overall stability of the neural system dynamics. How is this accomplished? Various approaches to this basic question have been considered. Here we propose a particularly compelling and natural mechanism for preserving stability of learning neural systems. This mechanism is based on the global processes by which metabolic resources are distributed to the neurons by glial cells. Specifically, we introduce and study a model comprised of two interacting networks: a model neural network interconnected by synapses which undergo spike-timing dependent plasticity (STDP); and a model glial network interconnected by gap junctions which diffusively transport metabolic resources among the glia and, ultimately, to neural synapses where they are consumed. Our main result is that the biophysical constraints imposed by diffusive transport of metabolic resources through the glial network can prevent runaway growth of synaptic strength, both during ongoing activity and during learning. Our findings suggest a previously unappreciated role for glial transport of metabolites in the feedback control stabilization of neural network dynamics during learning. The motion of driven interfaces in random media at finite temperature $T$ and small external force $F$ is usually described by a linear displacement $h_G(t) \sim V(F,T) t$ at large times, where the velocity vanishes according to the creep formula as $V(F,T) \sim e^{-K(T)/F^{\mu}}$ for $F \to 0$. In this paper, we question this picture on the specific example of the directed polymer in a two dimensional random medium. We have recently shown (C. Monthus and T. Garel, arxiv:0802.2502) that its dynamics for $F=0$ can be analyzed in terms of a strong disorder renormalization procedure, where the distribution of renormalized barriers flows towards some "infinite disorder fixed point". In the present paper, we obtain that for small $F$, this "infinite disorder fixed point" becomes a "strong disorder fixed point" with an exponential distribution of renormalized barriers. The corresponding distribution of trapping times then only decays as a power-law $P(\tau) \sim 1/\tau^{1+\alpha}$, where the exponent $\alpha(F,T)$ vanishes as $\alpha(F,T) \propto F^{\mu}$ as $F \to 0$. Our conclusion is that in the small force region $\alpha(F,T)<1$, the divergence of the averaged trapping time $\bar{\tau}=+\infty$ induces strong non-self-averaging effects that invalidate the usual creep formula obtained by replacing all trapping times by the typical value. We find instead that the motion is only sub-linear in time, $h_G(t) \sim t^{\alpha(F,T)}$, i.e. the asymptotic velocity vanishes, $V=0$. This analysis is confirmed by numerical simulations of a directed polymer with a metric constraint driven in a landscape of traps.
We moreover obtain that the roughness exponent, which is governed by the equilibrium value $\zeta_{eq}=2/3$ up to some large scale, becomes equal to $\zeta=1$ at the largest scales. In practice, since many communication networks are huge in scale, or complicated in structure, or even dynamic, predesigning linear network codes based on the network topology is infeasible even when the topological structure is known. Therefore, random linear network coding has been proposed as an acceptable coding technique for cases where the network topology cannot be fully utilized. Motivated by the fact that different network topological information can be obtained for different practical applications, we study the performance of random linear network coding by analyzing failure probabilities that depend on the different topological information of networks. We obtain some tight or asymptotically tight upper bounds on these failure probabilities and indicate the worst cases for these bounds, i.e., the networks meeting the upper bounds with equality. In addition, the more topological information of the network is utilized, the better the upper bounds obtained. On the other hand, we also discuss the lower bounds on the failure probabilities. Wireless Sensor Networks (WSNs), with growing applications in environments that are not within human reach, have received tremendous attention in the recent past. For optimized operation of the network, many routing algorithms have been proposed, mainly focusing on energy efficiency, network lifetime, and clustering processes. Considering a homogeneous network, we propose the Energy Efficient Sleep Awake Aware (EESAA) intelligent routing protocol for WSNs. In our proposed technique we evaluate and enhance certain issues like network stability, network lifetime and the cluster head selection process. Utilizing the concept of characteristic pairing among sensor nodes, energy utilization is optimized. Simulation results show that our proposed protocol significantly improves network stability and network lifetime. Inspired by biophysical principles underlying nonlinear dendritic computation in neural circuits, we develop a scheme to train deep neural networks to make them robust to adversarial attacks. Our scheme generates highly nonlinear, saturated neural networks that achieve state of the art performance on gradient based adversarial examples on MNIST, despite never being exposed to adversarially chosen examples during training. Moreover, these networks exhibit unprecedented robustness to targeted, iterative schemes for generating adversarial examples, including second-order methods. We further identify principles governing how these networks achieve their robustness, drawing on methods from information geometry. We find these networks progressively create highly flat and compressed internal representations that are sensitive to very few input dimensions, while still solving the task. Moreover, they employ highly kurtotic weight distributions, also found in the brain, and we demonstrate how such kurtosis can protect even linear classifiers from adversarial attack. We present new Neutron Spin Echo (NSE) results and a revisited analysis of historical data on spin glasses, which reveal a pure power-law time decay of the spin autocorrelation function $s(Q,t) = S(Q,t)/S(Q)$ at the glass temperature $T_g$, each power law exponent being in excellent agreement with that calculated from dynamic and static critical exponents deduced from macroscopic susceptibility measurements made on a quite different time scale.
This is the first time that this scaling relation, involving exponents of different physical quantities determined by completely independent experimental methods, has been stringently verified experimentally in a spin glass. As spin glasses are a subgroup of the vast family of glassy systems, which also comprises structural glasses, other non-crystalline systems, and living matter, the observed strict critical scaling behaviour is important. Above the phase transition the strikingly non-exponential relaxation, best fitted by the Ogielski (power-law times stretched exponential) function, appears as an intrinsic, homogeneous feature of spin glasses. Network sparsification aims to reduce the number of edges of a network while maintaining its structural properties; such properties include shortest paths, cuts, spectral measures, or network modularity. Sparsification has multiple applications, such as speeding up graph-mining algorithms, graph visualization, and identifying the important network edges. In this paper we consider a novel formulation of the network-sparsification problem. In addition to the network, we also consider as input a set of communities. The goal is to sparsify the network so as to preserve the network structure with respect to the given communities. We introduce two variants of the community-aware sparsification problem, leading to sparsifiers that satisfy different community connectedness properties. From the technical point of view, we prove hardness results and devise effective approximation algorithms. Our experimental results on a large collection of datasets demonstrate the effectiveness of our algorithms. As data sets grow in size, the ability of learning methods to find structure in them is increasingly hampered by the time needed to search the large spaces of possibilities and generate a score for each that takes all of the observed data into account. For instance, Bayesian networks, the model chosen in this paper, have a super-exponentially large search space for a fixed number of variables. One possible method to alleviate this problem is to use a proxy, such as a Gaussian Process regressor, in place of the true scoring function, training it on a selection of sampled networks. We prove here that the use of such a proxy is well-founded, as we can bound the smoothness of a commonly-used scoring function for Bayesian network structure learning. We show here that, compared to an identical search strategy using the network's exact scores, our proxy-based search is able to get equivalent or better scores on a number of data sets in a fraction of the time. In this paper, we design a Deep Dual-Domain ($\mathbf{D^3}$) based fast restoration model to remove artifacts of JPEG compressed images. It leverages the large learning capacity of deep networks, as well as the problem-specific expertise that was rarely incorporated in past designs of deep architectures. For the latter, we take into consideration both the prior knowledge of the JPEG compression scheme, and the successful practice of the sparsity-based dual-domain approach. We further design the One-Step Sparse Inference (1-SI) module, as an efficient and light-weight feed-forward approximation of sparse coding. Extensive experiments verify the superiority of the proposed $D^3$ model over several state-of-the-art methods. Specifically, our best model is capable of outperforming the latest deep model by around 1 dB in PSNR, and is 30 times faster.
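The abstract describes the One-Step Sparse Inference (1-SI) module only at a high level, as a light-weight feed-forward approximation of sparse coding. One plausible reading, in the spirit of LISTA truncated to a single iteration, is a learned linear encoding followed by soft-thresholding; the sketch below shows that operation. The encoder W, the threshold theta, and all shapes here are illustrative assumptions, not the paper's actual parameterization.

```python
import numpy as np

def soft_threshold(u, theta):
    # Elementwise soft-thresholding: the proximal operator of the l1 norm.
    return np.sign(u) * np.maximum(np.abs(u) - theta, 0.0)

def one_step_sparse_inference(x, W, theta):
    # Single feed-forward (one-iteration ISTA-style) approximation of the
    # sparse code z minimizing ||x - D z||^2 + lambda ||z||_1.
    # W plays the role of a learned encoder (e.g. a scaled D^T) and theta a
    # learned threshold; both would be trained end-to-end inside the network.
    return soft_threshold(W @ x, theta)

# Toy usage with random data (shapes are illustrative only).
rng = np.random.default_rng(0)
x = rng.standard_normal(64)               # e.g. a vectorized 8x8 DCT-domain patch
W = rng.standard_normal((128, 64)) / 8.0  # overcomplete encoder
z = one_step_sparse_inference(x, W, theta=0.1)
print("nonzeros in code:", np.count_nonzero(z))
```

Because the whole operation is a matrix product plus a pointwise nonlinearity, it is differentiable and cheap, which is what makes a module of this kind attractive as a feed-forward stand-in for iterative sparse coding.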
Convolutional neural nets (CNNs) have become a practical means to perform vision tasks, particularly in the area of image classification. FPGAs are well known to be able to perform convolutions efficiently; however, most recent efforts to run CNNs on FPGAs have shown limited advantages over other devices such as GPUs. Previous approaches on FPGAs have often been memory bound due to the limited external memory bandwidth on the FPGA device. We show a novel architecture written in OpenCL(TM), which we refer to as a Deep Learning Accelerator (DLA), that maximizes data reuse and minimizes external memory bandwidth. Furthermore, we show how we can use the Winograd transform to significantly boost the performance of the FPGA. As a result, when running our DLA on Intel's Arria 10 device we can achieve a performance of 1020 img/s, or 23 img/s/W when running the AlexNet CNN benchmark. This comes to 1382 GFLOPS and is 10x faster with 8.4x more GFLOPS and 5.8x better efficiency than the state-of-the-art on FPGAs. Additionally, 23 img/s/W is competitive against the best publicly known implementation of AlexNet on nVidia's TitanX GPU. The Ising Model has recently received much attention for the statistical description of neural spike train data. In this paper, we propose and demonstrate its use for building decoders capable of predicting, on a millisecond timescale, the stimulus represented by a pattern of neural activity. After fitting to a training dataset, the Ising decoder can be applied "online" for instantaneous decoding of test data. While such models can be fit exactly using Boltzmann learning, this approach rapidly becomes computationally intractable as neural ensemble size increases. We show that several approaches, including the Thouless-Anderson-Palmer (TAP) mean field approach from statistical physics, and the recently developed Minimum Probability Flow Learning (MPFL) algorithm, can be used for rapid inference of model parameters in large-scale neural ensembles. Use of the Ising model for decoding, unlike other problems such as functional connectivity estimation, requires estimation of the partition function. As this involves summation over all possible responses, this step can be limiting. Mean field approaches avoid this problem by providing an analytical expression for the partition function. We demonstrate these decoding techniques by applying them to simulated neural ensemble responses from a mouse visual cortex model, finding an improvement in decoder performance for a model with heterogeneous as opposed to homogeneous neural tuning and response properties. Our results demonstrate the practicality of using the Ising model to read out, or decode, spatial patterns of activity comprised of many hundreds of neurons. It has been recently reported that the reciprocity of real-life weighted networks is very pronounced; however, its impact on dynamical processes is poorly understood. In this paper, we study random walks in a scale-free directed weighted network with a trap at the central hub node, where the weight of each directed edge is dominated by a parameter controlling the extent of network reciprocity. We derive two expressions for the mean first passage time (MFPT) to the trap, by using two different techniques, the results of which agree well with each other.
We also analytically determine all the eigenvalues as well as their multiplicities for the fundamental matrix of the dynamical process, and show that the largest eigenvalue has an identical dominant scaling as that of the MFPT. We find that the weight parameter has a substantial effect on the MFPT, which behaves as a power-law function of the system size with the power exponent dependent on the parameter, signaling the crucial role of reciprocity in random walks occurring in weighted networks. It has remained an open question for some time whether, given a set of not necessarily binary (i.e. "nonbinary") trees T on a set of taxa X, it is possible to determine in time f(r)·poly(m) whether there exists a phylogenetic network that displays all the trees in T, where r refers to the reticulation number of the network and m=|X|+|T|. Here we show that this holds if one or both of the following conditions holds: (1) |T| is bounded by a function of r; (2) the maximum degree of the nodes in T is bounded by a function of r. These sufficient conditions absorb and significantly extend known special cases, namely when all the trees in T are binary, or T contains exactly two nonbinary trees. We believe this result is an important step towards settling the issue for an arbitrarily large and complex set of nonbinary trees. For completeness we show that the problem is certainly solvable in polynomial time. Social networks enable users to freely communicate with each other and share their recent news, ongoing activities or views about different topics. As a result, they can be seen as a potentially viable source of information to understand the current emerging topics/events. The ability to model emerging topics is a substantial step to monitor and summarize the information originating from social sources. Applying traditional methods for event detection, which are often designed for processing large, formal and structured documents, is less effective due to the short length, noisiness and informality of social posts. Recent event detection techniques address these challenges by exploiting the opportunities behind abundant information available in social networks. This article provides an overview of the state of the art in event detection from social networks. Due to deployment in hostile environments, wireless sensor networks are vulnerable to various attacks. Exhausted sensor nodes in a sensor network become a challenging issue because they disrupt the normal connectivity of the network. Affected nodes give rise to a denial of service that prevents the sensor network from achieving its objective in real life. A mathematical model based on an Absorbing Markov Chain (AMC) is proposed for Denial of Sleep attack detection in sensor networks. In this mechanism, whether the sensor network is affected by a denial of sleep attack can be decided by considering the expected death time of the sensor network under a normal scenario. Gain and order scheduling of fractional order (FO) PI$^{\lambda}$D$^{\mu}$ controllers is studied in this paper, considering four different classes of higher order processes. The mapping between the optimum PID/FOPID controller parameters and the reduced order process models is done using a Radial Basis Function (RBF) type Artificial Neural Network (ANN). Simulation studies have been done to show the effectiveness of the RBFNN for online scheduling of such controllers with random changes in set-point and process parameters.
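A scheduler of the kind just described can be sketched compactly: an RBF network is fitted offline to a table mapping reduced-order process model parameters to optimum controller parameters, and is then queried online as the process parameters drift. Everything below (the three-parameter process model, the random placeholder training table, the fixed kernel width) is an illustrative assumption rather than the paper's setup.

```python
import numpy as np

def rbf_design(X, centers, width):
    # Gaussian RBF feature matrix: phi_ij = exp(-||x_i - c_j||^2 / (2 w^2)).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * width**2))

# Hypothetical training table: rows of X are reduced-order model parameters
# (e.g. gain K, time constant T, delay L); rows of Y are the optimum
# controller parameters (Kp, Ki, Kd, lambda, mu) found offline.
rng = np.random.default_rng(0)
X = rng.uniform([0.5, 1.0, 0.1], [2.0, 10.0, 1.0], size=(50, 3))
Y = rng.uniform(0.1, 2.0, size=(50, 5))   # placeholder optima, illustration only

centers, width = X, 1.0                   # one centre per sample, fixed width
Phi = rbf_design(X, centers, width)
Wout = np.linalg.lstsq(Phi + 1e-6 * np.eye(len(X)), Y, rcond=None)[0]

def schedule_gains(model_params):
    # Online gain scheduling: map current process parameters to FOPID gains.
    phi = rbf_design(np.atleast_2d(model_params), centers, width)
    return (phi @ Wout).ravel()           # -> [Kp, Ki, Kd, lambda, mu]

print(schedule_gains([1.0, 5.0, 0.5]))
```

The appeal of this design is that the expensive controller optimization happens once, offline, while the online step is a single small matrix-vector product, cheap enough to re-evaluate whenever the identified process model changes.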
Motivated by the requirements of simulating lepton-nucleus deep inelastic scattering (DIS), we construct a Fortran program, LDCS 1.0, that calculates the differential and total cross sections for the unpolarized charged lepton-unpolarized nucleon and neutrino-unpolarized nucleon neutral current (charged current) DIS at leading order. Any set of experimentally fitted parton distribution functions can be employed directly. The mass of incident and scattered leptons is taken into account and the boundary conditions for calculating the single differential and total cross sections are studied. The calculated results agree well with the corresponding experimental data, indicating that the LDCS 1.0 program is reliable. It also turns out that the effect of the tauon mass is not negligible at the GeV energy scale. An artificial neural network for extracting reasonable and fast estimates of hyperfine parameters from M\"ossbauer spectra in the energy or time domain is outlined. First promising results for determining the asymmetry of the electric field gradient at the nucleus of a diamagnetic iron center as derived with different types of neural networks are reported. A dynamical mean-field approximation (DMA) previously proposed by the present author [H. Hasegawa, Phys. Rev. E {\bf 67}, 041903 (2003)] has been extended to ensembles described by a general noisy spiking neuron model. Ensembles of $N$-unit neurons, each of which is expressed by coupled $K$-dimensional differential equations (DEs), are assumed to be subject to spatially correlated white noises. The original $KN$-dimensional {\it stochastic} DEs have been replaced by $K(K+2)$-dimensional {\it deterministic} DEs expressed in terms of means and the second-order moments of {\it local} and {\it global} variables: the fourth-order contributions are taken into account by the Gaussian decoupling approximation. Our DMA has been applied to an ensemble of Hodgkin-Huxley (HH) neurons (K=4), for which effects of the noise, the coupling strength and the ensemble size on the response to a single-spike input have been investigated. Results calculated by DMA theory are in good agreement with those obtained by direct simulations. Recently, there has been a growing concern about the overload status of power grid networks, and the increasing possibility of cascading failures. Many researchers have studied these networks to provide design guidelines for more robust power grids. Topological analysis is one of the components of analyzing a system for its robustness. This paper presents a complex systems analysis of power grid networks. First, the cascading effect has been simulated on three well known networks: the IEEE 300 bus test system, the IEEE 118 bus test system, and the WSCC 179 bus equivalent model. To extend the analysis to a larger set of networks, we develop a network generator and generate multiple graphs with characteristics similar to the IEEE test networks but with different topologies. The generated graphs are then compared to the test networks to show the effect of topology in determining their robustness with respect to cascading failures. The generated graphs turn out to be more robust than the test graphs, showing the importance of topology in the robust design of power grids. The second part of this paper concerns the discussion of two novel mitigation strategies for cascading failures: Targeted Load Reduction and Islanding using Distributed Sources. These new mitigation strategies are compared with the Homogeneous Load Reduction strategy.
Even though the Homogeneous Load Reduction is simpler to implement, the Targeted Load Reduction is much more effective. Additionally, an algorithm is presented for the partitioning of the network for islanding as an effort towards fault isolation to prevent cascading failures. The results for island formation are better if the sources are well distributed; otherwise the algorithm leads to the formation of superislands. We investigate site percolation in a hierarchical scale-free network known as the Dorogovtsev-Goltsev-Mendes network. We use the generating function method to show that the percolation threshold is 1, i.e., the system is not in the percolating phase when the occupation probability is less than 1. The present result is contrasted to bond percolation in the same network, for which the percolation threshold is zero. We also show that the percolation threshold of intentional attacks is 1. Our results suggest that this hierarchical scale-free network is very fragile against both random failure and intentional attacks. Such a structural defect is common in many hierarchical network models. We investigate layered neural networks with differentiable activation function and student vectors without normalization constraint by means of equilibrium statistical physics. We consider the learning of perfectly realizable rules and find that the length of student vectors becomes infinite, unless a proper weight decay term is added to the energy. Then, the system undergoes a first order phase transition between states with very long student vectors and states where the lengths are comparable to those of the teacher vectors. Additionally, in both configurations there is a phase transition between a specialized and an unspecialized phase. An anti-specialized phase with long student vectors exists in networks with a small number of hidden units. A quantum-mechanical analysis of hyper-fast (faster than ballistic) diffusion of a quantum wave packet in random optical lattices is presented. The main motivation of the presented analysis is experimental demonstrations of hyper-diffusive spreading of a wave packet in random photonic lattices [L. Levi \textit{et al.}, Nature Phys. \textbf{8}, 912 (2012)]. A rigorous quantum-mechanical calculation of the mean probability amplitude is suggested, and it is shown that the power law spreading of the mean squared displacement (MSD) is $\langle x^2(t)\rangle \sim t^{\alpha}$, where $2<\alpha\leq 3$. The values of the transport exponent $\alpha$ depend on the correlation properties of the random potential $V(x,t)$, which describes random inhomogeneities of the medium. In particular, when the random potential is $\delta$ correlated in time, the quantum wave packet spreads according to Richardson turbulent diffusion with the MSD $\sim t^3$. Hyper-diffusion with $\alpha=12/5$ is also obtained for arbitrary correlation properties of the random potential. Not only is network coding essential to achieve the capacity of a single-session multicast network, it can also help to improve the throughput of wireless networks with multiple unicast sessions when overheard information is available. Most previous research aimed at realizing such improvement by using perfectly overheard information, while in practice, especially for wireless networks, overheard information is often imperfect. To date, it is unclear whether network coding should still be used in such situations with imperfect overhearing.
In this paper, a simple but ubiquitous wireless network model with two unicast sessions is used to investigate this problem. From the diversity and multiplexing tradeoff perspective, it is proved that even when overheard information is imperfect, network coding can still help to improve the overall system performance. This result implies that network coding should be used actively regardless of the reception quality of overheard information. The paper introduces a connectionist network approach to find numerical solutions of Diophantine equations as an attempt to address Hilbert's famous tenth problem. The proposed methodology uses a three layer feed forward neural network with back propagation as a sequential learning procedure to find numerical solutions of a class of Diophantine equations. It uses a dynamically constructed network architecture where the number of nodes in the input layer is chosen based on the number of variables in the equation. The powers of the given Diophantine equation are taken as input to the input layer. The training of the network starts with initial random integral weights. The weights are updated based on the back propagation of the error values at the output layer. The optimization of weights is augmented by adding a momentum factor into the network. The optimized weights of the connection between the input layer and the hidden layer are taken as a numerical solution of the given Diophantine equation. The procedure is validated using different Diophantine equations with different numbers of variables and different powers. Short-term synaptic depression and facilitation have been found to greatly influence the performance of autoassociative neural networks. However, only partial results, focused for instance on the computation of the maximum storage capacity at zero temperature, have been obtained to date. In this work, we extended the study of the effect of these synaptic mechanisms on autoassociative neural networks to more realistic and general conditions, including the presence of noise in the system. In particular, we characterized the behavior of the system by means of its phase diagrams, and we concluded that synaptic facilitation significantly enlarges the region of good retrieval performance of the network. We also found that networks with facilitating synapses may have critical temperatures substantially higher than those of standard autoassociative networks, thus allowing neural networks to perform better under high-noise conditions. We have created a large diverse set of cars from overhead images, which are useful for training a deep learner to binary classify, detect and count them. The dataset and all related material will be made publicly available. The set contains contextual matter to aid in identification of difficult targets. We demonstrate classification and detection on this dataset using a neural network we call ResCeption. This network combines residual learning with Inception-style layers and is used to count cars in one look. This is a new way to count objects rather than by localization or density estimation. It is fairly accurate, fast and easy to implement. Additionally, the counting method is not car or scene specific. It would be easy to train this method to count other kinds of objects, and counting over new scenes requires no extra set up or assumptions about object locations. The predominant traffic patterns in a wireless sensor network are many-to-one and one-to-many communication.
Hence, the performance of wireless sensor networks is characterized by the rate at which data can be disseminated from or aggregated to a data sink. In this paper, we consider the data aggregation problem. We demonstrate that a data aggregation rate of O(log(n)/n) is optimal and that this rate can be achieved in wireless sensor networks using a generalization of cooperative beamforming called cooperative time-reversal communication. In this paper, we have used Recurrent Neural Networks to capture and model human motion data and generate motions by prediction of the next immediate data point at each time-step. Our RNN is armed with recently proposed Gated Recurrent Units, which have shown promising results in some sequence modeling problems such as Machine Translation and Speech Synthesis. We demonstrate that this model is able to capture long-term dependencies in data and generate realistic motions. Most algorithms for propagating evidence through belief networks have been exact and exhaustive: they produce an exact (point-valued) marginal probability for every node in the network. Often, however, an application will not need information about every node in the network, nor will it need exact probabilities. We present the localized partial evaluation (LPE) propagation algorithm, which computes interval bounds on the marginal probability of a specified query node by examining a subset of the nodes in the entire network. Conceptually, LPE ignores parts of the network that are "too far away" from the queried node to have much impact on its value. LPE has the "anytime" property of being able to produce better solutions (tighter intervals) given more time to consider more of the network. In this letter, we consider the effect of the clustering coefficient on the synchronizability of coupled oscillators located on scale-free networks. The analytic value of the clustering coefficient is obtained for a highly clustered scale-free network model, the Holme-Kim model, and the relationship between network synchronizability and clustering coefficient is reported. The simulation results strongly suggest that the more clustered the network, the poorer its synchronizability. The driving force behind deep networks is their ability to compactly represent rich classes of functions. The primary notion for formally reasoning about this phenomenon is expressive efficiency, which refers to a situation where one network must grow unfeasibly large in order to realize (or approximate) functions of another. To date, expressive efficiency analyses have focused on the architectural feature of depth, showing that deep networks are representationally superior to shallow ones. In this paper we study the expressive efficiency brought forth by connectivity, motivated by the observation that modern networks interconnect their layers in elaborate ways. We focus on dilated convolutional networks, a family of deep models delivering state of the art performance in sequence processing tasks. By introducing and analyzing the concept of mixed tensor decompositions, we prove that interconnecting dilated convolutional networks can lead to expressive efficiency. In particular, we show that even a single connection between intermediate layers can already lead to an almost quadratic gap, which in large-scale settings typically makes the difference between a model that is practical and one that is not.
Empirical evaluation demonstrates how the expressive efficiency of connectivity, similarly to that of depth, translates into gains in accuracy. This leads us to believe that expressive efficiency may serve a key role in the development of new tools for deep network design. In this paper, we investigate deep image synthesis guided by sketch, color, and texture. Previous image synthesis methods can be controlled by sketch and color strokes, but we are the first to examine texture control. We allow a user to place a texture patch on a sketch at arbitrary location and scale to control the desired output texture. Our generative network learns to synthesize objects consistent with these texture suggestions. To achieve this, we develop a local texture loss in addition to adversarial and content loss to train the generative network. The new local texture loss can improve generated texture quality without knowing the patch location and size in advance. We conduct experiments using sketches generated from real images and textures sampled from the Describable Textures Dataset and results show that our proposed algorithm is able to generate plausible images that are faithful to user controls. Ablation studies show that our proposed pipeline can generate more realistic images than adapting existing methods directly. We propose a mechanism to describe spin relaxation in n-doped III-V semiconductors close to the Mott metal-insulator transition. Taking into account the spin-orbit interaction induced spin admixture in the hydrogenic donor states, we build a tight-binding model for the spin-dependent impurity band. Since the hopping amplitudes with spin flip are considerably smaller than the spin conserving counterparts, the resulting spin lifetime is very large. We estimate the spin lifetime from the diffusive accumulation of spin rotations associated with the electron hopping. Our result is larger than, but of the same order of magnitude as, the experimental value. Therefore the proposed mechanism has to be included when describing spin relaxation in the impurity band. Self-organizing cyber-physical systems are expected to become increasingly important in the context of Industry 4.0 automation as well as in everyday scenarios. Resilient communication is crucial for such systems. In general, this can be achieved with redundant communication paths. Mathematically, the amount of redundant paths is expressed with the network connectivity. A high network connectivity is required for collaboration and system-wide self-adaptation even when nodes fail or get compromised by an attacker. In this paper, we analyze the network connectivity of a communication network for large distributed cyber-physical systems. For this, we simulate the communication structure of a CPS with different network parameters to determine its resilience. With our results, we also deduce the required network connectivity for a given number of failing or compromised nodes. In this work we study a weak Prisoner's Dilemma game in which both strategies and update rules are subjected to evolutionary pressure. Interactions among agents are specified by complex topologies, and we consider both homogeneous and heterogeneous situations. We consider deterministic and stochastic update rules for the strategies, which in turn may consider single links or full context when selecting agents to copy from. Our results indicate that the co-evolutionary process preserves heterogeneous networks as a suitable framework for the emergence of cooperation.
Furthermore, on those networks, the update rule leading to a larger fraction of cooperators, which we call replicator dynamics, is selected during co-evolution. On homogeneous networks we observe that even if replicator dynamics turns out again to be the selected update rule, the cooperation level is larger than in a fixed update rule framework. We conclude that for a variety of topologies, the fact that the dynamics coevolves with the strategies leads in general to more cooperation in the weak Prisoner's Dilemma game. In machine learning, there is a fundamental trade-off between ease of optimization and expressive power. Neural Networks, in particular, have enormous expressive power and yet are notoriously challenging to train. The nature of that optimization challenge changes over the course of learning. Traditionally in deep learning, one makes a static trade-off between the needs of early and late optimization. In this paper, we investigate a novel framework, GradNets, for dynamically adapting architectures during training to get the benefits of both. For example, we can gradually transition from linear to non-linear networks, deterministic to stochastic computation, shallow to deep architectures, or even simple downsampling to fully differentiable attention mechanisms. Benefits include increased accuracy, easier convergence with more complex architectures, solutions to test-time execution of batch normalization, and the ability to train networks of up to 200 layers. We study the finite size fluctuations at the depinning transition for a one-dimensional elastic interface of size $L$ displacing in a disordered medium of transverse size $M=k L^\zeta$ with periodic boundary conditions, where $\zeta$ is the depinning roughness exponent and $k$ is a finite aspect ratio parameter. We focus on the crossover from the infinitely narrow ($k\to 0$) to the infinitely wide ($k\to \infty$) medium. We find that at the thermodynamic limit both the value of the critical force and the precise behavior of the velocity-force characteristics are {\it unique} and $k$-independent. We also show that the finite size fluctuations of the critical force (bias and variance) as well as the global width of the interface cross over from a power law to a logarithm as a function of $k$. Our results are relevant for understanding anisotropic size effects in force-driven and velocity-driven interfaces. In this paper I will describe some results that have been recently obtained in the study of random Euclidean matrices, i.e. matrices that are functions of random points in Euclidean space. In the case of translation invariant matrices one generically finds a phase transition between a phonon phase and a saddle phase. If we apply these considerations to the study of the Hessian of the Hamiltonian of the particles of a fluid, we find that this phonon-saddle transition corresponds to the dynamical phase transition in glasses, which has been studied in the framework of the mode coupling approximation. The Boson peak observed in glasses at low temperature is a remnant of this transition. Deep generative models parameterized by neural networks have recently achieved state-of-the-art performance in unsupervised and semi-supervised learning. We extend deep generative models with auxiliary variables, which improve the variational approximation. The auxiliary variables leave the generative model unchanged but make the variational distribution more expressive.
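In the notation commonly used for such auxiliary-variable models (the symbols here are assumed, not quoted from the paper): extending the generative model with a variable $a$ via $p(x,z,a)=p(x,z)\,p(a\mid x,z)$ leaves $p(x)$ unchanged, since $p(a\mid x,z)$ integrates to one, while the variational posterior $q(a\mid x)\,q(z\mid a,x)$ induces an implicit marginal $q(z\mid x)=\int q(z\mid a,x)\,q(a\mid x)\,da$ that can be far more expressive than a factorized choice. The corresponding lower bound reads

```latex
\log p(x) \;\ge\; \mathbb{E}_{q(a\mid x)\,q(z\mid a,x)}
  \left[ \log \frac{p(x,z)\, p(a\mid x,z)}{q(a\mid x)\, q(z\mid a,x)} \right].
```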
Inspired by the structure of the auxiliary variable we also propose a model with two stochastic layers and skip connections. Our findings suggest that more expressive and properly specified deep generative models converge faster with better results. We show state-of-the-art performance within semi-supervised learning on the MNIST, SVHN and NORB datasets. The Deviants' Dilemma is a two-person game in which individual gain conflicts with the choice for the global good. Evolutionary considerations yield fixed point attractors, with the phenomenon of exclusion potentially playing an important role when current opponent information is available. We carry out computer simulations which substantiate and illuminate the theoretical claims, and bring to light the pertinence of the choice between deterministic and stochastic dynamics, and the conjecture of 'ergodicity spread'. The multi-index matching is an NP-hard combinatorial optimization problem; for two indices it reduces to the well understood bipartite matching problem that belongs to the polynomial complexity class. We use the cavity method to solve the thermodynamics of the multi-index system with random costs. The phase diagram is much richer than for the case of the bipartite matching problem: it shows a finite temperature phase transition to a completely frozen glass phase, similar to what happens in the random energy model. We derive the critical temperature, the ground state energy density, and properties of the energy landscape, and compare the results to numerical studies based on exact analysis of small systems. Spatio-temporal dynamics of excitable media with discrete three-level active centers (ACs) and absorbing boundaries is studied numerically by means of a deterministic three-level model (see S. D. Makovetskiy and D. N. Makovetskii, on-line preprint cond-mat/0410460), which is a generalization of the Zykov-Mikhailov model (see Sov. Phys. -- Doklady, 1986, Vol.31, No.1, P.51) for the case of two-channel diffusion of excitations. In particular, we revealed some qualitatively new features of coexistence, competition and collapse of rotating spiral waves (RSWs) in three-level excitable media under conditions of strong influence of the second channel of diffusion. Part of these features are caused by an unusual mechanism of RSW evolution when RSW cores get into the surface layer of an active medium (i.e. the layer of ACs residing at the absorbing boundary). Instead of the well-known scenario of RSW collapse, which takes place after collision of an RSW core with the absorbing boundary, we observed complicated transformations of the core leading to nonlinear "reflection" of the RSW from the boundary or even to the birth of several new RSWs in the surface layer. To our knowledge, such nonlinear "reflections" of RSWs and the resulting die-hard vorticity in excitable media with absorbing boundaries were unknown earlier. We analyse the newest diffractive deep inelastic scattering data from HERA using the dipole model approach. We find reasonably good agreement between the predictions and the data, although the region of small values of the kinematic variable $\beta$ needs refinement. A way to do this is to consider an approach with diffractive parton distributions evolved with the DGLAP evolution equations. We describe algorithms for learning Bayesian networks from a combination of user knowledge and statistical data.
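The two-component design described next, a scoring metric plus a search procedure, can be sketched as follows. The correlation-based score is a stand-in for illustration only (the paper develops Bayesian scoring metrics), and the search is a plain greedy hill-climb over single-edge additions with an acyclicity check.

```python
import itertools
import random

random.seed(0)
# Toy data: three binary variables where B depends on A.
data = []
for _ in range(500):
    a = random.random() < 0.5
    b = a if random.random() < 0.9 else not a
    data.append({"A": a, "B": b, "C": random.random() < 0.5})
nodes = ["A", "B", "C"]

def corr(x, y):
    n = len(data)
    px = sum(d[x] for d in data) / n
    py = sum(d[y] for d in data) / n
    return sum(d[x] and d[y] for d in data) / n - px * py

def score(es):
    # Stand-in scoring metric: reward correlated pairs, penalize edges.
    # A real metric would return the posterior probability of the structure.
    return sum(abs(corr(u, v)) for u, v in es) - 0.05 * len(es)

def acyclic(es):
    # Kahn-style topological check.
    indeg = {v: 0 for v in nodes}
    for u, v in es:
        indeg[v] += 1
    frontier = [v for v in nodes if indeg[v] == 0]
    seen = 0
    while frontier:
        u = frontier.pop()
        seen += 1
        for s, t in es:
            if s == u:
                indeg[t] -= 1
                if indeg[t] == 0:
                    frontier.append(t)
    return seen == len(nodes)

# Search procedure: greedy hill-climbing over single-edge additions.
edges, improved = set(), True
while improved:
    improved = False
    for u, v in itertools.permutations(nodes, 2):
        cand = edges | {(u, v)}
        if (u, v) not in edges and acyclic(cand) and score(cand) > score(edges):
            edges, improved = cand, True
print(sorted(edges))   # recovers the A-B dependence
```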
The algorithms have two components: a scoring metric and a search procedure. The scoring metric takes a network structure, statistical data, and a user's prior knowledge, and returns a score proportional to the posterior probability of the network structure given the data. The search procedure generates networks for evaluation by the scoring metric. Previous work has concentrated on metrics for domains containing only discrete variables, under the assumption that the data represents a multinomial sample. In this paper, we extend this work, developing scoring metrics for domains containing all continuous variables or a mixture of discrete and continuous variables, under the assumption that continuous data is sampled from a multivariate normal distribution. Our work extends traditional statistical approaches for identifying vanishing regression coefficients in that we identify two important assumptions, called event equivalence and parameter modularity, that when combined allow the construction of prior distributions for multivariate normal parameters from a single prior Bayesian network specified by a user. This paper proposes a new approach to automatically quantify the severity of knee osteoarthritis (OA) from radiographs using deep convolutional neural networks (CNN). Clinically, knee OA severity is assessed using Kellgren \& Lawrence (KL) grades, a five-point scale. Previous work on automatically predicting KL grades from radiograph images was based on training shallow classifiers using a variety of hand-engineered features. We demonstrate that classification accuracy can be significantly improved using deep convolutional neural network models pre-trained on ImageNet and fine-tuned on knee OA images. Furthermore, we argue that it is more appropriate to assess the accuracy of automatic knee OA severity predictions using a continuous distance-based evaluation metric like mean squared error than it is to use classification accuracy. This leads to the formulation of the prediction of KL grades as a regression problem and further improves accuracy. Results on a dataset of X-ray images and KL grades from the Osteoarthritis Initiative (OAI) show a sizable improvement over the current state of the art. This work analyses the practice of sister city pairing. We investigate structural properties of the resulting city and country networks and present rankings of the most central nodes in these networks. We identify different country clusters and find that the practice of sister city pairing is not influenced by geographical proximity but results in highly assortative networks. We prove tight, network-topology-dependent bounds on the round complexity of computing well studied $k$-party functions such as set disjointness and element distinctness. Unlike the usual case in the CONGEST model in distributed computing, we fix the function and then vary the underlying network topology. This complements recent results of this kind on total communication that have received some attention. We also present some applications to distributed graph computation problems. Our main contribution is a proof technique that allows us to reduce the problem on a general graph topology to a relevant two-party communication complexity problem. However, unlike many previous works that also used the same high level strategy, we do not reason about a two-party communication problem that is induced by a cut in the graph.
To `stitch' back the various lower bounds from the two-party communication problems, we use the notion of a timed graph, which has seen prior use in network coding. Our reductions use some tools from Steiner tree packing and multi-commodity flow problems that have a delay constraint. The learning dynamics of on-line independent component analysis is analysed in the limit of large data dimension. We study a simple Hebbian learning algorithm that can be used to separate out a small number of non-Gaussian components from a high-dimensional data set. The de-mixing matrix parameters are confined to a Stiefel manifold of tall, orthogonal matrices and we introduce a natural gradient variant of the algorithm which is appropriate to learning on this manifold. For large input dimension the parameter trajectory of both algorithms passes through a sequence of unstable fixed points, each described by a diffusion process in a polynomial potential. Choosing the learning rate too large increases the escape time from each of these fixed points, effectively trapping the learning in a sub-optimal state. In order to avoid these trapping states a very low learning rate must be chosen during the learning transient, resulting in learning time-scales of $O(N^2)$ or $O(N^3)$ iterations where $N$ is the data dimension. Escape from each sub-optimal state results in a sequence of symmetry breaking events as the algorithm learns each source in turn. This is in marked contrast to the learning dynamics displayed by related on-line learning algorithms for multilayer neural networks and principal component analysis. Although the natural gradient variant of the algorithm has nice asymptotic convergence properties, it has an equivalent transient dynamics to the standard Hebbian algorithm. $Range$ and $load$ play key roles in the problem of attacks on links in random scale-free (RSF) networks. In this Brief Report we obtain the relation between $range$ and $load$ in RSF networks analytically by the generating function theory, and then estimate the impact of attacks on the $efficiency$ of the network. The analytical results show that short-range attacks are more destructive for RSF networks, and are confirmed numerically. Furthermore, our results are consistent with previous literature (Physical Review E \textbf{66}, 065103(R) (2002)). This paper presents a novel approach in a rarely studied area of computer vision: human interaction recognition in still images. We explore whether the facial regions and their spatial configurations contribute to the recognition of interactions. In this respect, our method involves extraction of several visual features from the facial regions, as well as incorporation of scene characteristics and deep features into the recognition. Extracted multiple features are utilized within a discriminative learning framework for recognizing interactions between people. Our designed facial descriptors are based on the observation that relative positions, sizes and locations of the faces are likely to be important for characterizing human interactions. Since there is no available dataset in this relatively new domain, a comprehensive new dataset which includes several images of human interactions is collected. Our experimental results show that faces and scene characteristics contain important information to recognize interactions between people. We present broad-band photometry and photometric redshifts for 187611 sources located in ~0.5deg^2 in the Lockman Hole area.
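The redshift-accuracy statistics quoted just below, the normalized scatter \sigma_{\Delta z/(1+z)} and the outlier fraction, can be computed as in this sketch (toy arrays; clipping outliers before taking the scatter is one common convention and is assumed here).

```python
import numpy as np

z_spec = np.array([0.30, 0.80, 1.20, 2.00, 0.50])
z_phot = np.array([0.32, 0.77, 1.50, 2.05, 0.48])

dz = (z_phot - z_spec) / (1 + z_spec)
outlier = np.abs(z_phot - z_spec) > 0.15 * (1 + z_spec)

sigma = np.std(dz[~outlier])   # scatter of the non-outlier sample
print("sigma =", sigma, "outlier fraction =", outlier.mean())
```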
The catalog includes 389 X-ray detected sources identified with the very deep XMM-Newton observations available for an area of 0.2 deg^2. The source detection was performed on the Rc, z' and B band images and the available photometry spans from the far ultraviolet to the mid infrared, reaching in the best case scenario 21 bands. Astrometry corrections and photometric cross-calibrations over the entire dataset allowed the computation of accurate photometric redshifts. Special treatment is undertaken for the X-ray sources, the majority of which are active galactic nuclei. Comparing the photometric redshifts to the available spectroscopic redshifts, we achieve for normal galaxies an accuracy of \sigma_{\Delta z/(1+z)}=0.036, with 12.7% outliers, while for the X-ray detected sources the accuracy is \sigma_{\Delta z/(1+z)}=0.069, with 18.3% outliers, where the outliers are defined as sources with |z_{phot}-z_{spec}|>0.15(1+z_{spec}). These results are a significant improvement over the previously available photometric redshifts for normal galaxies in the Lockman Hole, while this is the first time that photometric redshifts have been computed and made public for the AGN in this field. With social networking sites providing increasingly richer context, User-Centric Service (UCS) creation is expected to explode, following a success path similar to that of User-Generated Content. One of the major challenges in this emerging, highly user-centric networking paradigm is how to make these services, which are exploding in number yet individually of vanishing demand, available in a cost-effective manner. Of prime importance to the latter (and the focus of this paper) is the determination of the optimal location for hosting a UCS. Taking into account the particular characteristics of UCS, we formulate the problem as a facility location problem and devise a distributed and highly scalable heuristic solution to it. Key to the proposed approach is the introduction of a novel metric drawing on Complex Network Analysis. Given a current location of UCS, this metric helps to a) identify a small subgraph of nodes with high capacity to act as service demand concentrators; b) project on them a reduced yet accurate view of the global demand distribution that preserves the key attraction forces on UCS; and, ultimately, c) pave the service migration path towards its optimal location in the network. The proposed iterative UCS migration algorithm, called cDSMA, is extensively evaluated over synthetic and real-world network topologies. Our results show that cDSMA achieves high accuracy, fast convergence, remarkable insensitivity to the size and diameter of the network and resilience to inaccurate estimates of demands for UCS across the network. It is also shown to clearly outperform local-search heuristics for service migration that constrain the subgraph to the immediate neighbourhood of the node currently hosting the UCS. Extracting per-frame features using convolutional neural networks for real-time processing of video data is currently mainly performed on powerful GPU-accelerated workstations and compute clusters. However, there are many applications, such as smart surveillance cameras, that require or would benefit from on-site processing. To this end, we propose and evaluate a novel algorithm for change-based evaluation of CNNs for video data recorded with a static camera setting, exploiting the spatio-temporal sparsity of pixel changes.
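The change-based idea can be sketched as follows: with a static camera, compare consecutive frames and mark the coarse blocks whose pixels changed, so only those blocks need to be re-propagated through the CNN (block size and threshold are illustrative assumptions, not the paper's values).

```python
import numpy as np

def changed_blocks(prev, curr, block=16, thresh=8):
    # Mark each block whose maximum absolute pixel change exceeds thresh.
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    h, w = diff.shape
    mask = np.zeros((h // block, w // block), dtype=bool)
    for by in range(h // block):
        for bx in range(w // block):
            tile = diff[by*block:(by+1)*block, bx*block:(bx+1)*block]
            mask[by, bx] = tile.max() > thresh
    return mask

prev = np.zeros((64, 64), dtype=np.uint8)
curr = prev.copy()
curr[20:30, 20:30] = 200          # a small moving object
print(changed_blocks(prev, curr).sum(), "of", (64 // 16) ** 2, "blocks changed")
```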
We achieve an average speed-up of 8.6x over a cuDNN baseline on a realistic benchmark with a negligible accuracy loss of less than 0.1% and no retraining of the network. The resulting energy efficiency is 10x higher than that of per-frame evaluation and reaches an equivalent of 328 GOp/s/W on the Tegra X1 platform. In this paper, we propose a novel Deep Localized Makeup Transfer Network to automatically recommend the most suitable makeup for a female and synthesize the makeup on her face. Given a before-makeup face, her most suitable makeup is determined automatically. Then, both the before-makeup and the reference faces are fed into the proposed Deep Transfer Network to generate the after-makeup face. Our end-to-end makeup transfer network has several nice properties: (1) it is complete in function, including foundation, lip gloss, and eye shadow transfer; (2) it is cosmetic specific: different cosmetics are transferred in different manners; (3) it is localized: different cosmetics are applied to different facial regions; (4) it produces natural-looking results without obvious artifacts; (5) it offers controllable makeup lightness: various results from light makeup to heavy makeup can be generated. Qualitative and quantitative experiments show that our network performs much better than the methods of [Guo and Sim, 2009] and two variants of NeuralStyle [Gatys et al., 2015a]. We consider the problem of dynamic spectrum access for network utility maximization in multichannel wireless networks. The shared bandwidth is divided into K orthogonal channels, and the users access the spectrum using a random access protocol. In the beginning of each time slot, each user selects a channel and transmits a packet with a certain attempt probability. After each time slot, each user that has transmitted a packet receives a local observation indicating whether its packet was successfully delivered or not (i.e., an ACK signal). The objective is to find a multi-user strategy that maximizes a certain network utility in a distributed manner without online coordination or message exchanges between users. Obtaining an optimal solution for the spectrum access problem is computationally expensive in general due to the large state space and partial observability of the states. To tackle this problem, we develop a distributed dynamic spectrum access algorithm based on deep multi-user reinforcement learning. Specifically, at each time slot, each user maps its current state to spectrum access actions based on a trained deep-Q network used to maximize the objective function. Experimental results have demonstrated that users are capable of learning good policies that achieve strong performance in this challenging partially observable setting only from their ACK signals, without online coordination, message exchanges between users, or carrier sensing. Reinforcement Learning is gaining attention from the wireless networking community due to its potential to learn good-performing configurations only from the observed results. In this work we propose a stateless variation of Q-learning, which we apply to exploit spatial reuse in a wireless network. In particular, we allow networks to modify both their transmission power and the channel used solely based on the experienced throughput. We concentrate on a completely decentralized scenario in which no information about neighbouring nodes is available to the learners.
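A minimal sketch of such stateless Q-learning: one Q-value per (channel, power) action, updated from the observed throughput alone, with epsilon-greedy exploration. The throughput function is a stub standing in for the real interference coupling between neighbouring networks.

```python
import random

actions = [(ch, pw) for ch in range(2) for pw in (0.1, 1.0)]
Q = {a: 0.0 for a in actions}
alpha, eps = 0.1, 0.1

def observed_throughput(action):
    ch, pw = action
    # Hypothetical reward: channel 1 is less contended, low power aids reuse.
    return (1.5 if ch == 1 else 1.0) * (1.2 - pw) + random.gauss(0, 0.05)

for _ in range(5000):
    a = random.choice(actions) if random.random() < eps else max(Q, key=Q.get)
    r = observed_throughput(a)
    Q[a] += alpha * (r - Q[a])    # stateless update: no next-state term

print("preferred action:", max(Q, key=Q.get))
```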
Our results show that although the algorithm is able to find the best-performing actions to enhance aggregate throughput, there is high variability in the throughput experienced by the individual networks. We identify the cause of this variability as the adversarial setting of our setup, in which the most played actions provide intermittent good/poor performance depending on the neighbouring decisions. We also evaluate the effect of the intrinsic learning parameters of the algorithm on this variability. We model the formation of networks as a game where players aspire to maximize their own centrality by increasing the number of other players to which they are path-wise connected, while simultaneously incurring a cost for each added adjacent edge. We simulate the interactions between players using an algorithm that factors in rational strategic behavior based on a common objective function. The resulting networks exhibit pairwise stability, from which we derive necessary stability conditions for specific graph topologies. We then expand the model to simulate non-trivial games with large numbers of players. We show that using conditions necessary for the stability of star topologies we can induce the formation of hub players that positively impact the total welfare of the network. We summarize a theoretical framework based on global time-reparametrization invariance that explains the origin of dynamic fluctuations in glassy systems. We introduce the main ideas without getting into much technical detail. We describe a number of consequences arising from this scenario that can be tested numerically and experimentally, distinguishing those that can also be explained by other mechanisms from the ones that, we believe, are special to our proposal. We support our claims by presenting some numerical checks performed on the 3d Edwards-Anderson spin-glass. Finally, we discuss to what extent these ideas apply to super-cooled liquids, which have been studied in much more detail up to the present. In this paper we present a method for learning a discriminative classifier from unlabeled or partially labeled data. Our approach is based on an objective function that trades off mutual information between observed examples and their predicted categorical class distribution, against robustness of the classifier to an adversarial generative model. The resulting algorithm can either be interpreted as a natural generalization of the generative adversarial networks (GAN) framework or as an extension of the regularized information maximization (RIM) framework to robust classification against an optimal adversary. We empirically evaluate our method - which we dub categorical generative adversarial networks (or CatGAN) - on synthetic data as well as on challenging image classification tasks, demonstrating the robustness of the learned classifiers. We further qualitatively assess the fidelity of samples generated by the adversarial generator that is learned alongside the discriminative classifier, and identify links between the CatGAN objective and discriminative clustering algorithms (such as RIM). For a large multi-hop wireless network, it is preferable for nodes to make distributed and localized link-scheduling decisions with only interactions among a small number of neighbors. However, for a slowly decaying channel and densely populated interferers, a small neighborhood often results in nontrivial link outages and is thus insufficient for making optimal scheduling decisions.
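The normal approximation invoked in the next sentences can be sketched as follows: treating each far interferer as transmitting independently with some probability, their aggregate interference is modelled as a Gaussian, and the outage probability becomes a Gaussian tail (all numbers are toy assumptions, not from the paper).

```python
import math
import random

random.seed(1)
p = 0.3                                                    # transmit probability
gains = [random.uniform(1e-4, 1e-3) for _ in range(300)]   # far interferers

mu = p * sum(gains)                          # mean residual interference
sigma = math.sqrt(p * (1 - p) * sum(g * g for g in gains))

signal, noise, sinr_min = 1.0, 0.05, 8.0
budget = signal / sinr_min - noise           # tolerable total interference

# P(outage) = P(residual interference > budget) under the Gaussian model.
z = (budget - mu) / sigma
outage = 0.5 * math.erfc(z / math.sqrt(2))
print(f"mean residual {mu:.4f}, outage probability {outage:.2e}")
```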
The question arises of how to deal with the information outside a neighborhood in distributed link-scheduling. In this work, we develop a joint approximation of information and distributed link scheduling. We first apply machine learning approaches to model distributed link-scheduling with complete information. We then characterize the information outside a neighborhood in the form of residual interference as a random loss variable. The loss variable is further characterized by either a Mean Field approximation or a normal distribution based on the Lyapunov central limit theorem. The approximated information outside a neighborhood is incorporated in a factor graph. This results in joint approximation and distributed link-scheduling in an iterative fashion. Link-scheduling decisions are first made at each individual node based on the approximated loss variables. Loss variables are then updated and used for the next link-scheduling decisions. The algorithm alternates between these two phases until convergence. Interactive iterations among these variables are implemented with a message-passing algorithm over a factor graph. Simulation results show that using learned information outside a neighborhood jointly with distributed link-scheduling reduces the outage probability close to zero even for a small neighborhood. Study of the production of pairs of top quarks in association with a Higgs boson is one of the primary goals of the Large Hadron Collider over the next decade, as measurements of this process may help us to understand whether the uniquely large mass of the top quark plays a special role in electroweak symmetry breaking. Higgs bosons decay predominantly to \bbbar, yielding signatures for the signal that are similar to $t\bar{t}$ + jets with heavy flavor. Though particularly challenging to study due to the similar kinematics between signal and background events, such final states ($t\bar{t} b \bar{b}$) are an important channel for studying the top quark Yukawa coupling. This paper presents a systematic study of machine learning (ML) methods for detecting $t\bar{t}h$ in the $h \rightarrow b\bar{b}$ decay channel. Among the eight ML methods tested, we show that two models, extreme gradient boosted trees and neural network models, outperform alternative methods. We further study the effectiveness of ML algorithms by investigating the impact of feature set and data size, as well as the structure of the models. While an extended feature set and larger training sets expectedly lead to improved performance, shallow models deliver comparable or better performance than their deeper counterparts. Our study suggests that ensembles of trees and neurons, not necessarily deep, work effectively for the problem of $t\bar{t}h$ detection. Numerous real-world relations can be represented by signed networks with positive links (e.g., trust) and negative links (e.g., distrust). Link analysis plays a crucial role in understanding link formation and can advance various tasks in social network analysis such as link prediction. The majority of existing works on link analysis have focused on unsigned social networks. The existence of negative links determines that the properties and principles of signed networks are substantially distinct from those of unsigned networks, and thus we need dedicated efforts on link analysis in signed social networks.
In this paper, following the adoption of social theories in link analysis for unsigned networks, we adopt three social science theories, namely Emotional Information, Diffusion of Innovations and Individual Personality, to guide the task of link analysis in signed networks. Significant microstructural anisotropy is known to develop during shearing flow of attractive particle suspensions. These suspensions, and their capacity to form conductive networks, play a key role in flow-battery technology, among other applications. Herein, we present and test an analytical model for the tensorial conductivity of attractive particle suspensions. The model utilizes the mean fabric of the network to characterize the structure, and the relationship to the conductivity is inspired by a lattice argument. We test the accuracy of our model against a large number of computer-generated suspension networks, based on multiple in-house generation protocols, giving rise to particle networks that emulate the physical system. The model is shown to adequately capture the tensorial conductivity, both in terms of its invariants and its mean directionality. Measurements of fusion cross-sections of 7Li and 12C with 198Pt at deep sub-barrier energies are reported to unravel the role of the entrance channel in the occurrence of fusion hindrance. The onset of fusion hindrance has been clearly observed in the 12C + 198Pt system but not in the 7Li + 198Pt system, within the measured energy range. The emergence of the hindrance, moving from lighter (6,7Li) to heavier (12C,16O) projectiles, is explained employing a model that considers a gradual transition from a sudden to an adiabatic regime at low energies. The model calculation reveals a weak effect of the damping of coupling to collective motion for the present systems as compared to that obtained for systems with heavier projectiles. Research issues and data mining techniques for product recommendation and viral marketing have been widely studied. Existing works on seed selection in social networks do not take into account the effect of product recommendations in e-commerce stores. In this paper, we investigate the seed selection problem for viral marketing that considers both the effect of social influence and that of item inference (for product recommendation). We develop a new model, Social Item Graph (SIG), that captures both effects in the form of hyperedges. Accordingly, we formulate a seed selection problem, called the Social Item Maximization Problem (SIMP), and prove the hardness of SIMP. We design an efficient algorithm with a performance guarantee, called Hyperedge-Aware Greedy (HAG), for SIMP and develop a new index structure, called SIG-index, to accelerate the computation of the diffusion process in HAG. Moreover, to construct realistic SIG models for SIMP, we develop a statistical inference based framework to learn the weights of hyperedges from data. Finally, we perform a comprehensive evaluation of our proposals against various baselines. Experimental results validate our ideas and demonstrate the effectiveness and efficiency of the proposed model and algorithms over the baselines. The search for Majorana bound states in solid-state physics has been limited to materials which display a gap in their bulk spectrum. We show that such unpaired states appear in certain quasi-one-dimensional Josephson junction arrays with gapless bulk excitations. The bulk modes mediate a coupling between Majorana bound states via the Ruderman-Kittel-Kasuya-Yosida mechanism.
As a consequence, the lowest energy doublet acquires a finite energy difference. For a realistic set of parameters this energy splitting remains much smaller than the energy of the bulk eigenstates, even for short chains of length $L \sim 10$. We study the Glauber dynamics of Ising spin models with random bonds, on finitely connected random graphs. We generalize a recent dynamical replica theory, with which to predict the evolution of the joint spin-field distribution, to include random graphs with arbitrary degree distributions. The theory is applied to Ising ferromagnets on randomly diluted Bethe lattices, where we study the evolution of the magnetization and the internal energy. It predicts a prominent slowing down of the flow in the Griffiths phase, it suggests a further dynamical transition at lower temperatures within the Griffiths phase, and it is verified quantitatively by the results of Monte Carlo simulations. This paper presents measurements of \k\ and \lam\ production in neutral current, deep inelastic scattering of 26.7 GeV electrons and 820 GeV protons in the kinematic range $ 10 < Q^{2} < 640 $ GeV$^2$, $0.0003 < x < 0.01$, and $y > 0.04$. Average multiplicities for \k\ and \lam\ production are determined for transverse momenta \ptr\ $> 0.5 $ GeV and pseudorapidities $\left| \eta \right| < 1.3$. The multiplicities favour a stronger strange-to-light-quark suppression in the fragmentation chain than found in $e^+ e^-$ experiments. The production properties of \k's in events with and without a large rapidity gap with respect to the proton direction are compared. The ratio of neutral \k's to charged particles per event in the measured kinematic range is, within the present statistics, the same in both samples. We consider the problem of image representation for the tasks of unsupervised learning and semi-supervised learning. In those learning tasks, the raw image vectors may not provide enough representation for their intrinsic structures due to their highly dense feature space. To overcome this problem, the raw image vectors should be mapped to a proper representation space which can capture the latent structure of the original data and represent the data explicitly for further learning tasks such as clustering. Inspired by recent research on deep neural networks and representation learning, in this paper we introduce the multiple-layer auto-encoder into image representation; we also apply the locally invariant idea to our image representation with auto-encoders and propose a novel method, called Graph regularized Auto-Encoder (GAE). GAE can provide a compact representation which uncovers the hidden semantics and simultaneously respects the intrinsic geometric structure. Extensive experiments on image clustering show encouraging results of the proposed algorithm in comparison to the state-of-the-art algorithms on real-world cases. We propose a method to determine the locally preferred structure of model liquids. The latter is obtained numerically as the global minimum of the effective energy surface of clusters formed by small numbers of particles embedded in a liquid-like environment. The effective energy is the sum of the intra-cluster interaction potential and of an external field that describes the influence of the embedding bulk liquid at a mean-field level. Doing so we minimize the surface effects present in isolated clusters without introducing the full-blown geometrical frustration present in bulk condensed phases.
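The cluster-minimization step can be sketched as follows (assuming NumPy/SciPy, and omitting the external mean-field term, so this is a bare Lennard-Jones cluster; the known global minimum for 13 particles is the icosahedron at about -44.33 in reduced units, and the few random restarts used here may land in a local minimum instead).

```python
import numpy as np
from scipy.optimize import minimize

def lj_energy(flat):
    # Total Lennard-Jones energy, 4*(r^-12 - r^-6) summed over pairs.
    pos = flat.reshape(-1, 3)
    e = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            r2 = np.sum((pos[i] - pos[j]) ** 2)
            e += 4.0 * (r2 ** -6 - r2 ** -3)
    return e

rng = np.random.default_rng(0)
best = min(
    (minimize(lj_energy, rng.normal(scale=1.5, size=39), method="L-BFGS-B")
     for _ in range(5)),
    key=lambda res: res.fun,
)
print("lowest energy found:", best.fun)   # global minimum is about -44.33
```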
We find that the locally preferred structure of the Lennard-Jones liquid is an icosahedron, and that the liquid-like environment only slightly reduces the relative stability of the icosahedral cluster. The influence of the boundary conditions on the nature of the ground-state configuration of Lennard-Jones clusters is also discussed. Multivariate Poisson random variables subject to linear integer constraints arise in several application areas, such as queuing and biomolecular networks. This note shows how to compute conditional statistics in this context by employing WF Theory and associated algorithms. A symbolic computation package has been developed and is made freely available. A discussion of motivating biomolecular problems is also provided. Continuous neural field models with inhomogeneous synaptic connectivities are known to support traveling fronts as well as stable bumps of localized activity. We analyze stationary localized structures in a neural field model with periodic modulation of the synaptic connectivity kernel and find that they are arranged in a snakes-and-ladders bifurcation structure. In the case of Heaviside firing rates, we construct analytically symmetric and asymmetric states and hence derive closed-form expressions for the corresponding bifurcation diagrams. We show that the ideas proposed by Beck and co-workers to analyze snaking solutions to the Swift-Hohenberg equation remain valid for the neural field model, even though the corresponding spatial-dynamical formulation is non-autonomous. We investigate how the modulation amplitude affects the bifurcation structure and compare numerical calculations for steep sigmoidal firing rates with analytic predictions valid in the Heaviside limit. We investigate spectral correlations in quasi-one-dimensional Anderson insulators with broken time-reversal symmetry. While energy levels are uncorrelated in the thermodynamic limit of infinite wire-length, some correlations remain in finite-size Anderson insulators. Asymptotic behaviors of level-level correlations in these systems are known in the large- and small-frequency limits, corresponding to the regime of classical diffusive dynamics and the deep quantum regime of strong Anderson localization. Employing non-perturbative methods and a mapping to the Coulomb-scattering problem, recently introduced by {\it M.~A.~Skvortsov} and {\it P.~M.~Ostrovsky}, we derive a closed analytical expression for the spectral statistics in the classical-to-quantum region bridging the known asymptotic behaviors. We further discuss how Poisson statistics at large energies develop into Wigner-Dyson statistics as the wire-length decreases. We propose a Convolutional Neural Network (CNN) based algorithm - StuffNet - for object detection. In addition to the standard convolutional features trained for region proposal and object detection [31], StuffNet uses convolutional features trained for segmentation of objects and 'stuff' (amorphous categories such as ground and water). Through experiments on Pascal VOC 2010, we show the importance of features learnt from stuff segmentation for improving object detection performance. StuffNet improves performance from 18.8% mAP to 23.9% mAP for small objects. We also devise a method to train StuffNet on datasets that do not have stuff segmentation labels. Through experiments on Pascal VOC 2007 and 2012, we demonstrate the effectiveness of this method and show that StuffNet also significantly improves object detection performance on such datasets.
The topology of social networks can be understood as being inherently dynamic, with edges having a distinct position in time. Most characterizations of dynamic networks discretize time by converting temporal information into a sequence of network "snapshots" for further analysis. Here we study a highly resolved data set of a dynamic proximity network of 66 individuals. We show that the topology of this network evolves over a very broad distribution of time scales, that its behavior is characterized by strong periodicities driven by external calendar cycles, and that the conversion of inherently continuous-time data into a sequence of snapshots can produce highly biased estimates of network structure. We suggest that dynamic social networks exhibit a natural time scale \Delta_{nat}, and that the best conversion of such dynamic data to a discrete sequence of networks is done at this natural rate. Existing action detection algorithms usually generate action proposals through an extensive search over the video at multiple temporal scales, which brings about huge computational overhead and deviates from the human perception procedure. We argue that the process of detecting actions should naturally be one of observation and refinement: observe the current window and refine the span of the attended window to cover true action regions. In this paper, we propose an active action proposal model that learns to find actions through continuously adjusting the temporal bounds in a self-adaptive way. The whole process can be deemed an agent, which is first placed at a random position in the video and then adopts a sequence of transformations on the currently attended region to discover actions according to a learned policy. We utilize reinforcement learning, especially the Deep Q-learning algorithm, to learn the agent's decision policy. In addition, we use a temporal pooling operation to extract a more effective feature representation for the long temporal window, and design a regression network to adjust the position offsets between predicted results and the ground truth. Experimental results on THUMOS 2014 validate the effectiveness of the proposed approach, which can achieve competitive performance with current action detection algorithms using far fewer proposals. Large-scale datasets have driven the rapid development of deep neural networks for visual recognition. However, annotating a massive dataset is expensive and time-consuming. Web images and their labels are, in comparison, much easier to obtain, but direct training on such automatically harvested images can lead to unsatisfactory performance, because the noisy labels of Web images adversely affect the learned recognition models. To address this drawback we propose an end-to-end weakly-supervised deep learning framework which is robust to the label noise in Web images. The proposed framework relies on two unified strategies -- random grouping and attention -- to effectively reduce the negative impact of noisy web image annotations. Specifically, random grouping stacks multiple images into a single training instance and thus increases the labeling accuracy at the instance level. Attention, on the other hand, suppresses the noisy signals from both incorrectly labeled images and less discriminative image regions.
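The random-grouping strategy can be sketched as follows: bundle several noisily labeled web images of the same claimed class into one training instance, so that the instance-level label is correct whenever at least one image in the bundle is (file names and group size below are hypothetical); at training time the per-image scores of a group would then be pooled, e.g. by a max, before computing the loss.

```python
import random

def make_groups(image_paths, group_size=4, seed=0):
    # Shuffle, then cut into fixed-size bundles; the remainder is dropped.
    paths = image_paths[:]
    random.Random(seed).shuffle(paths)
    return [paths[i:i + group_size]
            for i in range(0, len(paths) - group_size + 1, group_size)]

web_images = [f"car_model_A/img_{k}.jpg" for k in range(10)]   # hypothetical
for group in make_groups(web_images):
    print(group)   # each bundle becomes a single training instance
```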
By conducting intensive experiments on two challenging datasets, including a newly collected fine-grained dataset with Web images of different car models, the superior performance of the proposed methods over competitive baselines is clearly demonstrated. In this paper, we present a novel deep learning approach, deeply-fused nets. The central idea of our approach is deep fusion, i.e., to combine the intermediate representations of base networks, where the fused output serves as the input of the remaining part of each base network, and to perform such combinations deeply over several intermediate representations. The resulting deeply fused net enjoys several benefits. First, it is able to learn multi-scale representations as it enjoys the benefits of more base networks (any group that could form the same fused network), beyond the initial group of base networks. Second, in our suggested fused net formed by one deep and one shallow base network, the flows of information from the earlier intermediate layer of the deep base network to the output and from the input to the later intermediate layer of the deep base network are both improved. Last, the deep and shallow base networks are jointly learnt and can benefit from each other. More interestingly, the essential depth of a fused net composed from a deep base network and a shallow base network is reduced, because the fused net could be composed from a less deep base network, and thus training the fused net is less difficult than training the initial deep base network. Empirical results demonstrate that our approach achieves superior performance over two closely-related methods, ResNet and Highway, and competitive performance compared to the state of the art. We analyze the statistics of gaps ($\Delta H$) between successive avalanches in one-dimensional random field Ising models (RFIMs) in an external field $H$ at zero temperature. In the first part of the paper we study the nearest-neighbour ferromagnetic RFIM. We map the sequence of avalanches in this system to a non-homogeneous Poisson process with an $H$-dependent rate $\rho(H)$. We use this to analytically compute the distribution of gaps $P(\Delta H)$ between avalanches as the field is increased monotonically from $-\infty$ to $+\infty$. We show that $P(\Delta H)$ tends to a constant $\mathcal{C}(R)$ as $\Delta H \to 0^+$, which displays a non-trivial behaviour with the strength of disorder $R$. We verify our predictions with numerical simulations. In the second part of the paper, motivated by avalanche gap distributions in driven disordered amorphous solids, we study a long-range antiferromagnetic RFIM. This model displays a gapped behaviour $P(\Delta H) = 0$ up to a system-size-dependent offset value $\Delta H_{\text{off}}$, and $P(\Delta H) \sim (\Delta H - \Delta H_{\text{off}})^{\theta}$ as $\Delta H \to \Delta H_{\text{off}}^+$. We perform numerical simulations on this model and determine $\theta \approx 0.95(5)$. We also discuss mechanisms which would lead to a non-zero exponent $\theta$ for general spin models with quenched random fields. In this paper we introduce a framework for computing accurate upper bounds on the WCET for hardware platforms with caches and pipelines. The methodology we propose consists of three steps: 1) given a program to analyse, compute an equivalent (WCET-wise) abstract program; 2) build a timed game by composing this abstract program with a network of timed automata modeling the architecture; and 3) compute the WCET as the optimal time to reach a winning state in this game.
We demonstrate the applicability of our framework on standard benchmarks for an ARM9 processor with instruction and data caches, and compute the WCET with UPPAAL-TiGA. We also show that this framework can easily be extended to take into account dynamic changes in the speed of the processor during program execution. Softmax GAN is a novel variant of the Generative Adversarial Network (GAN). The key idea of Softmax GAN is to replace the classification loss in the original GAN with a softmax cross-entropy loss in the sample space of one single batch. In the adversarial learning of $N$ real training samples and $M$ generated samples, the target of discriminator training is to distribute all the probability mass to the real samples, each with probability $\frac{1}{N}$, and distribute zero probability to generated data. In the generator training phase, the target is to assign equal probability to all data points in the batch, each with probability $\frac{1}{M+N}$. While the original GAN is closely related to Noise Contrastive Estimation (NCE), we show that Softmax GAN is the Importance Sampling version of GAN. We further demonstrate with experiments that this simple change stabilizes GAN training. In this paper, we study instances of complex neural networks, i.e. neural networks with complex topologies. We use Self-Organizing Map neural networks whose neighbourhood relationships are defined by a complex network, to classify handwritten digits. We show that topology has a small impact on performance and robustness to neuron failures, at least at long learning times. Performance may however be increased (by almost 10%) by artificial evolution of the network topology. In our experimental conditions, the evolved networks are more random than their parents, but display a more heterogeneous degree distribution. Repeating patterns of spike sequences from a neuronal network have been proposed to be useful in the reconstruction of the network topology. Reverberations in a physiologically realistic model with various physical connection topologies (from random to scale-free) have been simulated to study the effectiveness of the pattern-matching method in the reconstruction of network topology from network dynamics. Simulation results show that functional networks reconstructed from repeating spike patterns can be quite different from the original physical networks; even global properties, such as the degree distribution, cannot always be recovered. However, the pattern-matching method can be effective in identifying hubs in the network. Since the form of reverberations is quite different for networks with and without hubs, the form of reverberations together with the reconstruction by repeating spike patterns might provide a reliable method to detect hubs in neuronal cultures. We simulated the creep motion of flux lines subject to randomly distributed point-like pinning centers. It is found that at low temperatures, the pinning barrier $U$ defined in the Arrhenius-type $v-F$ characteristics increases with decreasing force, $U(F) \propto F^{-\mu}$, as predicted by previous theories. The exponent $\mu$ is evaluated as $0.28\pm 0.02$ for the vortex glass and $\mu\simeq 0.5\pm 0.02$ for the Bragg glass (BrG). The latter is in good agreement with the prediction of the scaling theory and the functional-renormalization-group theory of creep, while the former is a new estimate.
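The exponent $\mu$ can be read off from Arrhenius-type $v-F$ data by a straight-line fit, as in this sketch on synthetic data: with $v \sim \exp(-U(F)/T)$ and $U(F) = A F^{-\mu}$, plotting $\log(-T \log v)$ against $\log F$ gives slope $-\mu$.

```python
import numpy as np

mu_true, A, T = 0.5, 1.0, 0.1
F = np.linspace(0.2, 1.0, 9)
v = np.exp(-(A * F ** -mu_true) / T)     # synthetic creep velocities

y = np.log(-T * np.log(v))               # equals log A - mu * log F
slope, intercept = np.polyfit(np.log(F), y, 1)
print("estimated mu:", -slope)            # ~0.5 by construction
```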
Within BrG, we find that the pinning barrier is suppressed when the temperature is raised to approximately half of the melting temperature. Characterizations of this new transition at equilibrium are also presented, indicative of a phase transition associated with replica-symmetry breaking. We present the first public release of our generic neural network training algorithm, called SkyNet. This efficient and robust machine learning tool is able to train large and deep feed-forward neural networks, including autoencoders, for use in a wide range of supervised and unsupervised learning applications, such as regression, classification, density estimation, clustering and dimensionality reduction. SkyNet uses a `pre-training' method to obtain a set of network parameters that has empirically been shown to be close to a good solution, followed by further optimisation using a regularised variant of Newton's method, where the level of regularisation is determined and adjusted automatically; the latter uses second-order derivative information to improve convergence, but without the need to evaluate or store the full Hessian matrix, by using a fast approximate method to calculate Hessian-vector products. This combination of methods allows for the training of complicated networks that are difficult to optimise using standard backpropagation techniques. SkyNet employs convergence criteria that naturally prevent overfitting, and also includes a fast algorithm for estimating the accuracy of network outputs. The utility and flexibility of SkyNet are demonstrated by application to a number of toy problems, and to astronomical problems focusing on the recovery of structure from blurred and noisy images, the identification of gamma-ray bursters, and the compression and denoising of galaxy images. The SkyNet software, which is implemented in standard ANSI C and fully parallelised using MPI, is available at http://www.mrao.cam.ac.uk/software/skynet/. We show that the stochastic field theory for directed percolation in the presence of an additional conservation law (the C-DP class) can be mapped exactly to the continuum theory for the depinning of an elastic interface in short-range correlated quenched disorder. On one line of parameters commonly studied, this mapping leads to the simplest overdamped dynamics. Away from this line, an additional memory term arises in the interface dynamics; we argue that it does not change the universality class. Since C-DP is believed to describe the Manna class of self-organized criticality, this shows that Manna stochastic sandpiles and disordered elastic interfaces (i.e. the quenched Edwards-Wilkinson model) share the same universal large-scale behavior. A field-theory approach is used to investigate the "spin-glass effects" on the critical behaviour of systems with weak temperature-like quenched disorder. The renormalization group (RG) analysis of the effective Hamiltonian of a model with replica symmetry breaking (RSB) potentials of a general type is carried out in the two-loop approximation. The fixed-point (FP) stability, recently found within the one-step RSB RG treatment, is further explored in terms of replicon eigenvalues. We find that the traditional FPs, which are usually considered to describe the disorder-induced universal critical behaviour, remain stable when the continuous RSB modes are taken into account. The wide development of interconnectivity between cellular networks and the internet has made them vulnerable.
This exposure of cellular networks to the internet has increased threats to customer end equipment as well as the carrier infrastructure. The artistic style of a painting is a subtle aesthetic judgment used by art historians for grouping and classifying artwork. The recently introduced `neural-style' algorithm substantially succeeds in merging the perceived artistic style of one image or set of images with the perceived content of another. In light of this and other recent developments in image analysis via convolutional neural networks, we investigate the effectiveness of a `neural-style' representation for classifying the artistic style of paintings. A visualisation tool is presented to facilitate the study of large-scale communications networks. This tool provides a simple and effective way to summarise the topology of a complex network at a coarse level. The challenging requirements of 5G--from both the applications and the architecture perspectives--motivate the need to explore the feasibility of delivering services over new network architectures. As 5G proposes application-centric network slicing, which enables the use of new data planes realizable over a programmable compute, storage, and transport infrastructure, we consider Information-centric Networking (ICN) as a candidate network architecture to realize 5G objectives. This can co-exist with the end-to-end IP services that are offered today. To this effect, we first propose a 5G-ICN architecture and compare its benefits (i.e., innovative services offered by leveraging ICN features) to current 3GPP-based mobile architectures. We then introduce a general application-driven framework that emphasizes the flexibility afforded by Network Function Virtualization (NFV) and Software Defined Networking (SDN), over which 5G-ICN can be realized. We specifically focus on the issue of how mobility-as-a-service (MaaS) can be realized as a 5G-ICN slice, and give an in-depth overview of resource provisioning and of the inter-dependencies and coordination among functional 5G-ICN slices needed to meet the MaaS objectives. A Mobile Ad-hoc Network (MANET) is a collection of autonomous nodes or terminals which communicate with each other by forming a multi-hop radio network and maintaining connectivity in a decentralized manner. The conventional security solutions that provide key management through accessing trusted authorities or centralized servers are infeasible for this new environment, since mobile ad hoc networks are characterized by the absence of any infrastructure, frequent mobility, and wireless links. We propose a group key management scheme that is hierarchical and fully distributed, with no central authority, and uses a simple rekeying procedure which is suitable for large and high-mobility mobile ad hoc networks. The rekeying procedure requires only one round in our scheme; in the Chinese Remainder Theorem Diffie-Hellman, Group Diffie-Hellman, and Burmester-Desmedt schemes it is a constant (three rounds), whereas in other schemes, such as Distributed Logical Key Hierarchy and Distributed One-Way Function Trees, it depends on the number of members. We reduce the energy consumption during communication of the keying materials by reducing the number of bits in the rekeying message. We show through analysis and simulations that our scheme has less computation, communication and energy consumption compared to the existing schemes.
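The Chinese Remainder Theorem step underlying CRT-based rekeying can be sketched as follows (toy numbers, not a full protocol): the controller combines per-member shares into a single broadcast value, from which each member, holding a private pairwise-coprime modulus, recovers only its own share.

```python
from functools import reduce
from math import gcd

def crt(residues, moduli):
    # Reconstruct x with x = r_i (mod m_i) for pairwise-coprime moduli.
    assert all(gcd(a, b) == 1 for a in moduli for b in moduli if a != b)
    M = reduce(lambda x, y: x * y, moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)      # modular inverse (Python 3.8+)
    return x % M

moduli = [7, 11, 13]        # members' private moduli (toy values)
shares = [3, 5, 2]          # per-member key shares
X = crt(shares, moduli)     # single broadcast value
print(X, [X % m for m in moduli])   # each member recovers its own share
```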
We present two Bayesian procedures to infer the interactions and external currents in an assembly of stochastic integrate-and-fire neurons from the recording of their spiking activity. The first procedure is based on the exact calculation of the most likely time courses of the neuron membrane potentials conditioned by the recorded spikes, and is exact for a vanishing noise variance and for an instantaneous synaptic integration. The second procedure takes into account the presence of fluctuations around the most likely time courses of the potentials, and can deal with moderate noise levels. The running time of both procedures is proportional to the number S of spikes multiplied by the squared number N of neurons. The algorithms are validated on synthetic data generated by networks with known couplings and currents. We also reanalyze previously published recordings of the activity of the salamander retina (including from 32 to 40 neurons, and from 65,000 to 170,000 spikes). We study the dependence of the inferred interactions on the membrane leaking time; the differences and similarities with the classical cross-correlation analysis are discussed. Phishing is an increasingly sophisticated method to steal personal user information using sites that pretend to be legitimate. In this paper, we take the following steps to identify phishing URLs. First, we carefully select lexical features of the URLs that are resistant to obfuscation techniques used by attackers. Second, we evaluate the classification accuracy when using only lexical features, both automatically and hand-selected, vs. when using additional features. We show that lexical features are sufficient for all practical purposes. Third, we thoroughly compare several classification algorithms, and we propose to use an online method (AROW) that is able to overcome noisy training data. Based on the insights gained from our analysis, we propose PhishDef, a phishing detection system that uses only URL names and combines the above three elements. PhishDef is a highly accurate method (when compared to state-of-the-art approaches over real datasets), lightweight (thus appropriate for online and client-side deployment), proactive (based on online classification rather than blacklists), and resilient to training data inaccuracies (thus enabling the use of large noisy training data). We propose a simple preferential attachment model of a growing network using the complementary probability of the Barab\'asi-Albert (BA) model, i.e., $\Pi(k_i) \propto 1-\frac{k_i}{\sum_j k_j}$. In this network, new nodes are preferentially attached to not well connected nodes. Numerical simulations, in perfect agreement with the master equation solution, give an exponential degree distribution. This suggests that the power-law degree distribution is a consequence of the preferential attachment probability together with the "rich get richer" phenomenon. We also calculate the average degree of a target node at time $t$, $\langle k(t) \rangle$, and its fluctuations, to have a better view of the microscopic evolution of the network, and we compare the results with the BA model. Survivable design of cross-layer networks, such as the cloud computing infrastructure, lies in resource deployment and allocation and in the mapping of the logical (virtual datacenter/IP) network onto the physical infrastructure (cloud backbone/WDM), such that link or node failure(s) in the physical infrastructure would not result in cascading failures in the logical network.
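The survivability requirement can be made concrete with a small check (assuming networkx; topology and routes are toy data): a mapping of logical links onto physical paths survives a single physical link failure if, for every physical link, the logical graph minus the logical links routed over it remains connected.

```python
import networkx as nx

physical = nx.cycle_graph(5)                    # 5-node physical ring
logical = nx.Graph([(0, 2), (2, 4), (4, 0)])    # logical triangle
route = {                                       # physical path per logical link
    (0, 2): [(0, 1), (1, 2)],
    (2, 4): [(2, 3), (3, 4)],
    (4, 0): [(4, 0)],
}

def survivable(logical, route, physical):
    for failed in physical.edges():
        failed = tuple(sorted(failed))
        g = logical.copy()
        for ll, path in route.items():
            if any(tuple(sorted(pl)) == failed for pl in path):
                g.remove_edge(*ll)
        if not nx.is_connected(g):
            return False, failed
    return True, None

print(survivable(logical, route, physical))     # (True, None) here
```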
Most of the prior approaches to survivable cross-layer network design aim at the single-link failure scenario and are not applicable to the more challenging multi-failure scenarios. Also, since many of these approaches use the cross-layer cut concept, enumeration of all cuts in the network is required, introducing an exponential number of constraints. To overcome these difficulties, we investigate in this paper survivable mapping approaches against multiple physical link failures and a special case thereof, Shared Risk Link Group (SRLG) failures. We present necessary and sufficient conditions, based on both cross-layer spanning trees and cutsets, to guarantee a survivable mapping when multiple physical link failures occur. Based on these conditions, we propose to solve the problem through (1) mixed-integer linear programs that avoid enumerating all combinations of link failures, and (2) an algorithm that generates and adds logical spanning trees sequentially. Our simulation results show that the proposed approaches produce survivable mappings effectively against both $k$-link and SRLG failures. In this paper, we present an effective method to analyze the recognition confidence of handwritten Chinese characters, based on the softmax regression score of a high-performance convolutional neural network (CNN). Through careful and thorough statistics on 827,685 test samples randomly selected from a total of 8836 different classes of Chinese characters, we find that the CNN-based confidence measurement is a useful metric for judging how reliable the recognition results are. Furthermore, we find experimentally that the recognition confidence can be used to find similar and confusable character pairs, to check wrongly or cursively written samples, and even to discover and correct mislabelled samples. Many interesting observations and statistics are given and analyzed in this study. An important problem in physics concerns the analysis of audio time series generated by transduced acoustic phenomena. Here, we develop a new method to quantify the scaling properties of the local variance of nonstationary time series. We apply this technique to analyze audio signals obtained from selected genres of music. We find quantitative differences in the correlation properties of high art music, popular music, and dance music. We discuss the relevance of these objective findings in relation to the subjective experience of music. Deep neural networks (DNNs) have revolutionized the field of natural language processing (NLP). Convolutional neural networks (CNNs) and recurrent neural networks (RNNs), the two main types of DNN architectures, are widely explored to handle various NLP tasks. CNNs are supposed to be good at extracting position-invariant features and RNNs at modeling units in sequence. The state of the art on many NLP tasks often switches due to the battle between CNNs and RNNs. This work is the first systematic comparison of CNNs and RNNs on a wide range of representative NLP tasks, aiming to give basic guidance for DNN selection. An important goal for the machine learning (ML) community is to create approaches that can learn solutions with human-level capability. One domain where humans have held a significant advantage is visual processing. A significant line of work addressing this gap comprises machine learning approaches inspired by natural systems, such as artificial neural networks (ANNs), evolutionary computation (EC), and generative and developmental systems (GDS).
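As an illustration of the recognition-confidence measurement for handwritten characters described above, the sketch below (a generic reconstruction, not the paper's implementation) converts a CNN's class scores into softmax probabilities, uses the top probability as the confidence, and exposes the runner-up class, whose closeness signals a confusable character pair:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a vector of class scores."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def recognition_confidence(logits):
    """Return (top class, its probability, runner-up class).

    The top probability serves as the confidence score; a small gap
    to the runner-up suggests a confusable character pair."""
    p = softmax(np.asarray(logits, dtype=float))
    order = np.argsort(p)[::-1]
    return order[0], p[order[0]], order[1]

# Hypothetical scores over 4 character classes.
cls, conf, runner_up = recognition_confidence([2.0, 1.9, -1.0, -3.0])
if conf < 0.6:
    print(f"low confidence: class {cls} vs {runner_up} ({conf:.2f})")
```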
Research into deep learning has demonstrated that such architectures can achieve performance competitive with humans on some visual tasks; however, these systems have been trained primarily through supervised and unsupervised learning algorithms. Alternatively, research is showing that evolution may have a significant role in the development of visual systems. Thus this paper investigates the role neuro-evolution (NE) can take in deep learning. In particular, Hypercube-based NeuroEvolution of Augmenting Topologies (HyperNEAT) is an NE approach that can effectively learn large neural structures by training an indirect encoding that compresses the ANN weight pattern as a function of geometry. The results show that HyperNEAT struggles with performing image classification by itself, but can be effective in training a feature extractor that other ML approaches can learn from. Thus neuro-evolution combined with other ML methods provides an intriguing area of research that can replicate processes found in nature. Incidents of organized cybercrime are rising because criminals are reaping high financial rewards while incurring low costs to commit crime. As the digital landscape broadens to accommodate more internet-enabled devices and technologies like social media, more cybercriminals who are not native English speakers are invading cyberspace to cash in on quick exploits. In this paper we evaluate the performance of three machine learning classifiers in detecting 419 scams in a bilingual Nigerian cybercriminal community. We use three classifiers popular in text processing, namely Na\"ive Bayes, k-nearest neighbors (IBK), and Support Vector Machines (SVM). The preliminary results on a real-world dataset reveal that SVM significantly outperforms Na\"ive Bayes and IBK at the 95% confidence level. Deep convolutional neural networks (CNNs) have shown good performance in many computer vision tasks. However, the high computational complexity of CNNs involves a huge amount of data movement between the processor core and the memory hierarchy, which accounts for the major share of the power consumption. This paper presents Chain-NN, a novel energy-efficient 1D chain architecture for accelerating deep CNNs. Chain-NN consists of dedicated dual-channel processing engines (PEs). In Chain-NN, convolutions are done by 1D systolic primitives composed of groups of adjacent PEs. These systolic primitives, together with the proposed column-wise scan input pattern, can fully reuse input operands to reduce the memory bandwidth requirement for energy saving. Moreover, the 1D chain architecture allows the systolic primitives to be easily reconfigured for specific CNN parameters with low design complexity. Chain-NN is synthesized and laid out in a TSMC 28nm process. It costs 3751k logic gates and 352KB of on-chip memory. The results show that a 576-PE Chain-NN can run at up to 700MHz, achieving a peak throughput of 806.4GOPS at 567.5mW, and can accelerate the five convolutional layers of AlexNet at a frame rate of 326.2fps. Its power efficiency of 1421.0GOPS/W is at least 2.5x to 4.1x better than state-of-the-art works. One of the important issues in wireless networks is routing, which strongly affects system performance; in this article we propose a routing algorithm based on the bee colony method to reduce energy consumption in wireless relay networks.
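For a sense of how the three-classifier comparison for scam detection above can be set up, here is a sketch using scikit-learn on a tiny hypothetical corpus (the data, features, and parameters are assumptions; IBK is WEKA's name for k-nearest neighbors, and the authors' actual dataset and tooling differ):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical bilingual corpus: 1 = scam, 0 = legitimate.
texts = ["urgent transfer of $25M awaits you", "meeting moved to 3pm",
         "send your bank details for inheritance", "lunch tomorrow?"]
labels = [1, 0, 1, 0]

for name, clf in [("Naive Bayes", MultinomialNB()),
                  ("k-NN (IBK)", KNeighborsClassifier(n_neighbors=1)),
                  ("SVM", LinearSVC())]:
    # Bag-of-words / bigram features feeding each classifier.
    pipe = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), clf)
    scores = cross_val_score(pipe, texts, labels, cv=2)
    print(name, scores.mean())
```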
In the EBCD algorithm, energy, distance, and traffic parameters are combined to yield a routing algorithm for wireless networks that is more efficient than its predecessor. Applying the bee colony method allows these parameters to be balanced under conventional conditions, bringing the scheme closer to a mechanism with better adaptability than the existing algorithm. Based on the parameters considered, the proposed algorithm provides a fitness function that can be applied in a multi-hop setting. Unlike other algorithms of its kind, it can increase service quality under varying environmental conditions through its multiple services. The new method can conserve the energy stored in the nodes and reduce hop restrictions. A favorable environment for downbursts associated with the deep convective storm systems that occur over the central and eastern continental United States includes strong static instability, with large amounts of convective available potential energy, and the presence of a mid-tropospheric layer of dry air. However, previous research has identified that over the central United States, especially in the Great Plains region, an environment intermediate between those favorable for wet and for dry microbursts may exist during the convective season, resulting in the generation of hybrid-type microbursts. Hybrid microbursts have been found to originate from deep convective storms that generate heavy precipitation, with sub-cloud evaporation of precipitation a significant factor in downdraft acceleration. Accordingly, a new GOES sounder-derived product, the GOES Hybrid Microburst Index, is under development; it is designed to assess the potential for convective downbursts that develop in an environment intermediate between the wet type, associated with heavy precipitation, and the dry type, associated with convection in which very little to no precipitation is observed at the surface. Electrical analogues of fracture, such as the fuse network model, are widely studied. However, the "analogy" between the electrical problem and the elastic problem is rarely established explicitly. Further, the fuse network is a discrete approximation to the continuous problem of fracture, and it is rarely, if ever, shown that the discrete approximation indeed approaches its continuum limit. We establish both of these correspondences directly. We present results from a study of the photometric redshift performance of the Dark Energy Survey (DES), using early data from a Science Verification (SV) period of observations in late 2012 and early 2013 that provided science-quality images for almost 200 sq.~deg.~at the nominal depth of the survey. We assess the photometric redshift performance using about 15000 galaxies with spectroscopic redshifts available from other surveys. These galaxies are used, in different configurations, as a calibration sample, and photo-$z$'s are obtained and studied using most of the existing photo-$z$ codes. A weighting method in a multi-dimensional color-magnitude space is applied to the spectroscopic sample in order to evaluate the photo-$z$ performance with sets that mimic the full DES photometric sample, which is on average significantly deeper than the calibration sample due to the limited depth of spectroscopic surveys. Empirical photo-$z$ methods using, for instance, Artificial Neural Networks or Random Forests yield the best performance in the tests, achieving core photo-$z$ resolutions $\sigma_{68} \sim 0.08$.
Moreover, the results from most of the codes, including template-fitting methods, comfortably meet the DES requirements on photo-$z$ performance, thereby providing an excellent precedent for future DES data sets. A neural network technique is used to discriminate between quark and gluon jets produced in the $qg \to q\gamma$ and $q\bar{q} \to g\gamma$ processes at the LHC. Considering the network as a trigger, and using the PYTHIA event generator and CMSJET, the fast full-event simulation package for the CMS detector, we obtain signal-to-background ratios. A wide range of applications require or can benefit from the collaborative behavior of a group of agents. The technical challenge addressed in this chapter is the development of a decentralized control strategy that enables each agent to navigate independently such that the agents achieve a collective goal while maintaining network connectivity. Specifically, cooperative controllers are developed for networked agents with limited sensing and network connectivity constraints. By modeling the interaction among the agents as a graph, several different approaches to preserving network connectivity are presented, with a focus on a method that utilizes navigation function frameworks. By modeling network connectivity constraints as artificial obstacles in navigation functions, a decentralized control strategy is presented for two particular applications, formation control and rendezvous for a system of autonomous agents, which ensures global convergence to the unique minimum of the potential field (i.e., the desired formation or destination) while preserving network connectivity. Simulation results are provided to demonstrate the developed strategy. We present a search for eclipses of $\sim$1700 white dwarfs in the Pan-STARRS1 medium-deep fields. Candidate eclipse events are selected by identifying low outliers in over 4.3 million light curve measurements. We find no short-duration eclipses consistent with being caused by a planetary-size companion. This large dataset enables us to place strong constraints on close-in planet occurrence rates around white dwarfs for planets as small as 2 R$_\oplus$. Our results indicate that gas giant planets orbiting just outside the Roche limit are rare, occurring around less than 0.5% of white dwarfs. Habitable-zone super-Earths and hot super-Earths are less abundant than similar classes of planets around main-sequence stars. These constraints give important insight into the ultimate fate of the large population of exoplanets orbiting main-sequence stars. Spatial evolutionary games are studied with myopic players whose payoff interest, as a personal trait, is tuned from selfishness to other-regarding preference via fraternity. The players are located on a square lattice and collect income from symmetric two-person two-strategy (cooperation and defection) games with their nearest neighbors. During the elementary steps of evolution a randomly chosen player modifies her strategy in order to stochastically maximize her utility function, composed of her own and her co-players' income with weight factors $1-Q$ and $Q$. These models are studied within a wide range of payoff parameters using Monte Carlo simulations for noisy strategy updates and by spatial stability analysis in the low-noise limit. For fraternal players ($Q=1/2$) the system evolves, in the low-noise limit, into ordered arrangements of strategies that provide the optimum payoff for the whole society.
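A minimal Monte Carlo sketch of the myopic update rule just described (the payoff values, lattice size, and noise level are illustrative assumptions, not the paper's parameters): a randomly chosen player on a periodic square lattice flips her strategy with a probability that grows with the gain in the utility $(1-Q)\times$(own income) $+\, Q\times$(co-players' income):

```python
import math
import random

L, Q, K = 32, 0.5, 0.1          # lattice size, fraternity weight, noise
T, S = 1.3, -0.1                # illustrative prisoner's dilemma payoffs
payoff = {(1, 1): 1.0, (1, 0): S, (0, 1): T, (0, 0): 0.0}  # 1=C, 0=D

grid = [[random.randint(0, 1) for _ in range(L)] for _ in range(L)]

def neighbors(x, y):
    """Four nearest neighbors on a periodic square lattice."""
    return [((x + 1) % L, y), ((x - 1) % L, y),
            (x, (y + 1) % L), (x, (y - 1) % L)]

def utility(x, y, s):
    """(1-Q) * own income + Q * co-players' income against neighbors."""
    own = sum(payoff[(s, grid[nx][ny])] for nx, ny in neighbors(x, y))
    co = sum(payoff[(grid[nx][ny], s)] for nx, ny in neighbors(x, y))
    return (1 - Q) * own + Q * co

for step in range(100000):
    x, y = random.randrange(L), random.randrange(L)
    s = grid[x][y]
    gain = utility(x, y, 1 - s) - utility(x, y, s)
    if random.random() < 1 / (1 + math.exp(-gain / K)):  # noisy update
        grid[x][y] = 1 - s

print("cooperator fraction:", sum(map(sum, grid)) / L**2)
```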
Dominance of defectors, representing the "tragedy of the commons", is found within the prisoner's dilemma and stag hunt regions for selfish players ($Q=0$). Due to the symmetry of the effective utility function, the system exhibits similar behavior even for $Q=1$, which can be interpreted as the "lovers' dilemma". In this paper we consider two sequence tagging tasks for medieval Latin: part-of-speech tagging and lemmatization. These are basic yet foundational preprocessing steps in applications such as text re-use detection. Nevertheless, they are generally complicated by the considerable orthographic variation typical of medieval Latin. In Digital Classics, these tasks are traditionally solved in a (i) cascaded and (ii) lexicon-dependent fashion. For example, a lexicon is used to generate all the potential lemma-tag pairs for a token, and next a context-aware PoS-tagger is used to select the most appropriate tag-lemma pair. Apart from the problems with out-of-lexicon items, error percolation is a major downside of such approaches. In this paper we explore the possibility of solving these tasks elegantly with a single, integrated approach, making use of a layered neural network architecture from the field of deep representation learning. We discuss transport on load-bearing branching hierarchical networks, which can serve as models of diverse systems such as river networks, computer networks, respiratory networks, and granular media. We study avalanche transmissions and directed percolation on these networks, and on the V lattice, i.e., the strongest realization of the lattice. We find that typical realizations of the lattice show multimodal distributions for the avalanche transmissions, and a second-order transition for directed percolation. On the other hand, the V lattice shows power-law behavior for avalanche transmissions and a first-order (explosive) transition to percolation. The V lattice is thus the critical case of hierarchical networks. We note that small perturbations to the V lattice destroy the power-law behavior of the distributions and the first-order nature of the percolation. We discuss the implications of our results. Data aggregation in intermediate nodes (called aggregator nodes) is an effective approach for optimizing consumption of scarce resources like bandwidth and energy in Wireless Sensor Networks (WSNs). However, in-network processing poses a problem for the privacy of the sensor data, since the individual data of sensor nodes need to be known to the aggregator node before aggregation can be carried out. In applications of WSNs, privacy-preserving data aggregation has become an important requirement due to the sensitive nature of the sensor data. Researchers have proposed a number of protocols and schemes for this purpose. He et al. (INFOCOM 2007) proposed a protocol, called CPDA, for carrying out additive data aggregation in a privacy-preserving manner in WSNs, and the scheme has become popular and well known. Despite its popularity, the protocol has been found to be vulnerable to attack and not energy-efficient. In this paper, we first present a brief state-of-the-art survey of current privacy-preserving data aggregation protocols for WSNs. Then we describe the CPDA protocol and identify its security vulnerability. Finally, we demonstrate how the protocol can be made secure and energy-efficient.
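For intuition about privacy-preserving additive aggregation in general, the sketch below uses generic pairwise masking, not the actual CPDA construction: each pair of nodes shares a random value that one adds and the other subtracts, so the aggregator recovers the exact sum without seeing any individual reading:

```python
import random

def masked_readings(readings, modulus=10**6, seed=42):
    """Mask each node's reading with pairwise random shares.

    For each pair (i, j), node i adds +r and node j adds -r, so the
    masks cancel in the aggregate: the aggregator sees only masked
    values, yet their sum mod the modulus equals the true sum.
    (Generic illustration; CPDA itself works differently.)"""
    rng = random.Random(seed)
    n = len(readings)
    masked = list(readings)
    for i in range(n):
        for j in range(i + 1, n):
            r = rng.randrange(modulus)
            masked[i] = (masked[i] + r) % modulus
            masked[j] = (masked[j] - r) % modulus
    return masked

data = [37, 41, 29]                        # private sensor values
obs = masked_readings(data)                # what the aggregator receives
print(obs, sum(obs) % 10**6 == sum(data))  # the sum is preserved
```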
The force networks of different granular ensembles are defined and their topological properties studied using the tools of complex networks. In particular, for each set of grains compressed in a square box, a force threshold is introduced that determines which contacts form the network. The topological characteristics of the network are then analyzed as a function of this parameter. The characterization of the structural features thus obtained may be useful in understanding the macroscopic physical behavior exhibited by this class of media. The growth in data traffic and the increased demand for quality of service have generated a large demand for more efficient network systems. The introduction of improved routing systems to meet the increasing demand, and of varied protocols to accommodate various scales of challenges in network efficiency, has further complicated operations. This means that a better mode of intelligence has to be infused into networking for smoother operations and better autonomic features. Cognitive networks are defined and analyzed from this angle. They are identified as having the potential to deliver future user-related quality and efficiency of service at optimized levels. The cognitive elements of a system, such as perception, learning, planning, reasoning, and decision making, can enable systems to be more aware of their environment and offer better services. These approaches are expected to transform the mode of operation of future networks. We study graded-response attractor neural networks with asymmetrically extremely dilute interactions and Langevin dynamics. We solve our model in the thermodynamic limit using generating functional analysis, and find (in contrast to the binary-neuron case) that even in statics one cannot eliminate the non-persistent order parameters. The macroscopic dynamics is driven by the (non-trivial) joint distribution of neurons and fields, rather than just the (Gaussian) field distribution. We calculate phase transition lines and present simulation results in support of our theory. Stimulus from the environment that guides behavior and informs decisions is encoded in the firing rates of neural populations. Each neuron in the populations, however, does not spike independently: spike events are correlated from cell to cell. To what degree does this apparent redundancy impact the accuracy with which decisions can be made, and the computations that are required to decide optimally? We explore these questions for two illustrative models of correlation among cells. Each model is statistically identical at the level of pairs of cells, but differs in the higher-order statistics that describe the simultaneous activity of larger cell groups. We find that the presence of correlations can diminish the performance attained by an ideal decision maker to either a small or a large extent, depending on the nature of the higher-order interactions. Moreover, while this optimal performance can in some cases be obtained via the standard integration-to-bound operation, in others it requires a nonlinear computation on incoming spikes. Overall, we conclude that a given level of pairwise correlations--even when restricted to identical neural populations--may not always indicate redundancies that diminish decision-making performance. Hard exclusive production in deep inelastic lepton scattering provides access to the unknown Generalized Parton Distributions (GPDs) of the nucleon.
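The "standard integration-to-bound operation" mentioned above can be sketched as a simple accumulator (a generic drift-diffusion illustration with assumed drift, noise, and bound values, not the paper's specific model):

```python
import random

def integrate_to_bound(evidence_stream, bound=30.0):
    """Accumulate signed evidence until the running total crosses
    +bound or -bound; return (choice, decision_time)."""
    total, t = 0.0, 0
    for t, e in enumerate(evidence_stream, start=1):
        total += e
        if abs(total) >= bound:
            break
    return (1 if total > 0 else -1), t

# Assumed stimulus: drift of +0.3 per step plus unit Gaussian noise.
rng = random.Random(1)
stream = (0.3 + rng.gauss(0.0, 1.0) for _ in range(10000))
choice, rt = integrate_to_bound(stream)
print(choice, rt)
```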
At HERMES, different observables for hard exclusive $\pi^+$ production have been measured with a 27.6 GeV positron beam on an internal hydrogen gas target. First preliminary results for the unpolarized $ep \to en\pi^+$ total cross section for 1.5